Add the evaluator. On my side the score is 2/71, because most of the responses are: "⚠️ Limite de requêtes atteinte. Attendez 1 minute et réessayez." (rate limit reached; wait 1 minute and retry). We should try a model that accepts more requests, possibly one running locally.
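One way to soften the rate-limit failures described in the commit message is to retry with exponential backoff instead of recording the request as failed. The following is a minimal sketch, not part of this commit: `call_with_backoff` and `RateLimited` are hypothetical names, and `RateLimited` stands in for whatever exception the real client raises when the limit is hit.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for the client's rate-limit error."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimited with delays of 1x, 2x, 4x... base_delay."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("max_retries must be at least 1")
```

The `sleep` parameter is injectable only so the helper can be exercised without real waiting; in the pipeline it would wrap the call that queries the model.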
357
data/eval/sample_questions.json
Normal file
@@ -0,0 +1,357 @@
+[
+    {
+        "question": "What is the name of the company?",
+        "answer": "The company is named TShirtCo, a fictional enterprise established with a focus on personalized t-shirt production. This name reflects its core business of designing and selling custom apparel, aligning with its mission to offer unique, ethically produced clothing to its customers as of November 13, 2025.",
+        "expected_answer": "TShirtCo"
+    },
+    {
+        "question": "Where is TShirtCo located?",
+        "answer": "TShirtCo is located at 123 Rue de Paris, 75000 Paris, France. This central address in Paris, a major cultural and commercial hub, was strategically chosen to enhance visibility and accessibility, supporting its growth since January 1, 2020, up to today, November 13, 2025.",
+        "expected_answer": "123 Rue de Paris, 75000 Paris, France"
+    },
+    {
+        "question": "What is the SIRET number of TShirtCo?",
+        "answer": "The SIRET number of TShirtCo is 123 456 789 00012. This unique identifier, issued by the French government, confirms its legal registration since 2020, essential for tax and business operations as of November 13, 2025.",
+        "expected_answer": "123 456 789 00012"
+    },
+    {
+        "question": "Who is the director of TShirtCo?",
+        "answer": "The director of TShirtCo is John Doe, who has led the company since its founding on January 1, 2020. As the visionary behind its ethical production focus, he continues to steer its strategic direction as of November 13, 2025.",
+        "expected_answer": "John Doe"
+    },
+    {
+        "question": "What is the capital social of TShirtCo?",
+        "answer": "The capital social of TShirtCo is 50,000 €. This initial investment, established at its creation, has funded equipment and inventory, providing a solid financial foundation for its operations over the past nearly six years as of late 2025.",
+        "expected_answer": "50 000 €"
+    },
+    {
+        "question": "What is the price of a red T-shirt with Motif1 in size M?",
+        "answer": "The price of a red T-shirt with Motif1 in size M is 18.5 €. This base cost reflects French manufacturing and Polish cotton, offering a competitive rate for a personalized, ethically produced item, available in stock as of November 13, 2025.",
+        "expected_answer": "18.5 €"
+    },
+    {
+        "question": "Where is the red T-shirt with Motif1 manufactured?",
+        "answer": "The red T-shirt with Motif1 is manufactured in France. This local production supports TShirtCo's ethical standards, reducing carbon footprint and boosting the French textile industry, a practice upheld since 2020 and ongoing in 2025.",
+        "expected_answer": "France"
+    },
+    {
+        "question": "How many red T-shirts with Motif1 are in stock?",
+        "answer": "There are 148 red T-shirts with Motif1 currently in stock. This inventory, spread across sizes S, M, L, and XL, ensures TShirtCo can meet demand as of November 13, 2025, though levels may adjust with new orders.",
+        "expected_answer": "148"
+    },
+    {
+        "question": "What is the total stock of all T-shirts across all colors and motifs?",
+        "answer": "The total stock of all T-shirts across all colors and motifs exceeds 1,000 units, with individual stocks like 148 for red Motif1, 160 for blue Motif8, and 196 for black Motif5. This aggregate, based on the catalog as of November 2025, reflects a robust inventory managed by TShirtCo.",
+        "expected_answer": ">1000 (approximate)"
+    },
+    {
+        "question": "What is the net profit for TShirtCo in 2024?",
+        "answer": "The net profit for TShirtCo in 2024 is 19,040 €. This figure, derived from a 185,400 € turnover minus expenses and a 3,360 € tax, indicates a healthy financial year ending December 31, 2024, setting a strong base for 2025 growth.",
+        "expected_answer": "19 040 €"
+    },
+    {
+        "question": "How many employees does TShirtCo have?",
+        "answer": "TShirtCo currently employs 8 individuals as of November 13, 2025. This team includes full-time staff like John Doe and part-time workers like Tom Lemoine, supporting its production and sales operations since its expansion in 2024.",
+        "expected_answer": "8"
+    },
+    {
+        "question": "What is the salary of Claire Moreau?",
+        "answer": "Claire Moreau, the Responsable production, earns a gross monthly salary of 3,200 €. This compensation, set since her hiring on February 15, 2020, reflects her critical role in overseeing TShirtCo's manufacturing as of late 2025.",
+        "expected_answer": "3 200 €"
+    },
+    {
+        "question": "When was Ahmed Benali hired?",
+        "answer": "Ahmed Benali was hired on June 10, 2021. His role as an Opérateur sérigraphie has been pivotal in TShirtCo's personalized t-shirt production, contributing to its growth over the past four years as of November 13, 2025.",
+        "expected_answer": "10/06/2021"
+    },
+    {
+        "question": "What is the client name for order CMD-2024-001?",
+        "answer": "The client name for order CMD-2024-001 is Marie Dupont. She placed this order on March 15, 2024, for red T-shirts with Motif1 and blue T-shirts with Motif5, which was delivered, marking an early success for TShirtCo.",
+        "expected_answer": "Marie Dupont"
+    },
+    {
+        "question": "What is the status of order CMD-2025-003?",
+        "answer": "The status of order CMD-2025-003 is 'Annulée' (Cancelled). Placed by Tom Durand on March 5, 2025, this order was canceled, possibly due to a change in plans, as recorded in TShirtCo's order history as of today.",
+        "expected_answer": "Annulée"
+    },
+    {
+        "question": "What is the total amount TTC for invoice FAC-2025-042?",
+        "answer": "The total amount TTC for invoice FAC-2025-042 is 81.12 €. Issued on March 14, 2025, to Léa Moreau, this includes t-shirts and shipping with 20% VAT, reflecting a standard transaction completed by late 2025.",
+        "expected_answer": "81.12 €"
+    },
+    {
+        "question": "What is the main product supplied by TShirtFactory?",
+        "answer": "The main product supplied by TShirtFactory is t-shirts, specifically blancs (white) and colorés (colored). As TShirtCo's primary supplier since 2020, it provides the base garments for personalization, ensuring a consistent production flow as of November 2025.",
+        "expected_answer": "T-shirts blancs & colorés"
+    },
+    {
+        "question": "What is the stock of green T-shirt with Motif2 in size S?",
+        "answer": "The stock of green T-shirts with Motif2 in size S is 50 units. This portion of the 190-unit total stock indicates popularity for this size, potentially requiring reordering if demand persists into late 2025.",
+        "expected_answer": "50"
+    },
+    {
+        "question": "What is the total payroll cost in 2024?",
+        "answer": "The total payroll cost in 2024 is 38,000 €. This covers salaries for all 8 employees, including John Doe's 5,500 €/month and Tom Lemoine's 800 €/month, reflecting a significant operational expense for the year.",
+        "expected_answer": "38 000 €"
+    },
+    {
+        "question": "What is the date of the financial report?",
+        "answer": "The date of the financial report is December 31, 2024. This marks the end of TShirtCo's fiscal year, providing a detailed overview of its 126,320 € in assets and 19,040 € net profit, crucial for 2025 planning.",
+        "expected_answer": "31/12/2024"
+    },
+    {
+        "question": "What is the total amount HT for CMD-2025-002?",
+        "answer": "The total amount HT for CMD-2025-002 is 2,090.00 €. This covers 100 blue T-shirts with Motif8 ordered by Entreprise TechWear on February 25, 2025, before VAT, indicating a significant bulk transaction.",
+        "expected_answer": "2090.00 €"
+    },
+    {
+        "question": "What is the grammage of a violet T-shirt with Motif1?",
+        "answer": "The grammage of a violet T-shirt with Motif1 is 184 g/m². This medium-weight fabric, paired with German cotton, offers durability and comfort, aligning with TShirtCo's quality standards as of November 2025.",
+        "expected_answer": "184 g/m²"
+    },
+    {
+        "question": "Based on the 2024 turnover and expenses, what might be the projected profit for 2025 if sales increase by 10%?",
+        "answer": "Based on a 2024 turnover of 185,400 € and expenses totaling 163,000 € (including 98,200 € purchases, 28,500 € charges, 38,000 € salaries), the 2024 profit was 22,400 € before tax. A 10% sales increase to 203,940 €, with expenses rising proportionally to 179,300 €, suggests a pre-tax profit of approximately 24,640 €, reduced by a 15% tax (3,696 €) to a net profit of about 20,944 € for 2025, assuming stable conditions.",
+        "expected_answer": "Approx. 20 944 €"
+    },
+    {
+        "question": "If CMD-2025-002 is delayed by one week, how might this affect the stock of blue T-shirts with Motif8?",
+        "answer": "A one-week delay in CMD-2025-002, originally planned for March 24, 2025, would shift production past the current date of November 13, 2025. With 160 units in stock and 100 ordered, the delay might reduce available stock to 60 units if demand outpaces production, potentially impacting customer satisfaction unless reordering is expedited.",
+        "expected_answer": "Reduces available stock to 60 units"
+    },
+    {
+        "question": "What should TShirtCo do if the cotton provenance for a new motif is unknown?",
+        "answer": "If the cotton provenance for a new motif is unknown, TShirtCo should halt production until verified, contact suppliers like CotonBio Europe for clarification, and conduct an audit to ensure ethical sourcing compliance. This proactive approach protects the brand's reputation and aligns with its ethical standards as of late 2025.",
+        "expected_answer": "Halt production, verify with suppliers, conduct audit"
+    },
+    {
+        "question": "What is the stock of a non-existent motif like Motif11?",
+        "answer": "Since Motif11 does not exist in TShirtCo's catalog as of November 13, 2025, the stock is 0. The model should recognize this as an error case and return a null or zero value, indicating no available data, rather than fabricating a response.",
+        "expected_answer": "0"
+    },
+    {
+        "question": "Calculate the average price of all red T-shirts listed.",
+        "answer": "The red T-shirts listed include prices of 18.5 € (Motif1), 21.5 € (Motif2), 21.6 € (Motif3), 21.4 € (Motif4), 18.5 € (Motif5), 21.5 € (Motif6), 19.8 € (Motif7), 20.1 € (Motif8), 20.1 € (Motif9), and 19.5 € (Motif10). Averaging these 10 prices (199.0 € total) yields approximately 19.9 €, reflecting a consistent pricing strategy as of late 2025.",
+        "expected_answer": "Approx. 19.9 €"
+    },
+    {
+        "question": "Is CMD-2025-005 deliverable by November 13, 2025, given the current production schedule?",
+        "answer": "CMD-2025-005, placed on March 28, 2025, with production scheduled for March 25-26, 2025, should be deliverable by now, November 13, 2025, assuming a standard 2-3 week shipping period. The status 'Planifiée' suggests it was on track, likely completed unless delays occurred.",
+        "expected_answer": "Yes"
+    },
+    {
+        "question": "What is the provenance of the cotton for the blue T-shirt with Motif10?",
+        "answer": "The provenance of the cotton for the blue T-shirt with Motif10 is Switzerland, labeled as 'Coton de Suisse.' This premium source enhances quality, supporting TShirtCo's ethical standards as of November 13, 2025.",
+        "expected_answer": "Coton de Suisse"
+    },
+    {
+        "question": "How many t-shirts were ordered in CMD-2025-001?",
+        "answer": "A total of 3 t-shirts were ordered in CMD-2025-001, placed by Sophie Leroy on January 12, 2025, including 2 gray T-shirts with Motif1 and 1 white T-shirt with Motif3, all successfully paid for as of late 2025.",
+        "expected_answer": "3"
+    },
+    {
+        "question": "What is the status of order CMD-2024-002?",
+        "answer": "The status of order CMD-2024-002 is 'Livrée' (Delivered). Placed by Lucas Martin on April 22, 2024, for 5 t-shirts, this order was fulfilled, contributing to TShirtCo's early sales success as recorded by November 2025.",
+        "expected_answer": "Livrée"
+    },
+    {
+        "question": "What is the total amount TTC for CMD-2024-004?",
+        "answer": "The total amount TTC for CMD-2024-004 is 1,146.00 €. This covers 50 yellow T-shirts with Motif3 ordered by Association EcoPrint on June 3, 2024, with 20% VAT on 955.00 € HT, reflecting a significant bulk purchase.",
+        "expected_answer": "1146.00 €"
+    },
+    {
+        "question": "What is the place of fabrication for the gray T-shirt with Motif10?",
+        "answer": "The gray T-shirt with Motif10 is fabricated in France. This local production ensures quality control and ethical standards, using French cotton, a practice consistent with TShirtCo's operations as of November 2025.",
+        "expected_answer": "France"
+    },
+    {
+        "question": "How many employees were hired in 2024?",
+        "answer": "Two employees were hired in 2024: Sarah Klein on November 12, 2023 (effective into 2024) and Tom Lemoine on September 1, 2024. This expansion reflects TShirtCo's growth as of November 13, 2025.",
+        "expected_answer": "2"
+    },
+    {
+        "question": "What is the total stock of black T-shirts with Motif1?",
+        "answer": "The total stock of black T-shirts with Motif1 is 141 units. This inventory, across sizes S, M, L, and XL, supports demand with Czech Republic cotton, managed effectively as of late 2025.",
+        "expected_answer": "141"
+    },
+    {
+        "question": "What is the grammage of a blue T-shirt with Motif2?",
+        "answer": "The grammage of a blue T-shirt with Motif2 is 181 g/m². This medium-weight fabric, with Belgian cotton, offers comfort and durability, aligning with TShirtCo's quality focus as of November 2025.",
+        "expected_answer": "181 g/m²"
+    },
+    {
+        "question": "What is the client email for CMD-2024-005?",
+        "answer": "The client email for CMD-2024-005 is paul.richard@yahoo.com. This personal email, linked to Paul Richard's order on July 18, 2024, facilitated a successful delivery as recorded by late 2025.",
+        "expected_answer": "paul.richard@yahoo.com"
+    },
+    {
+        "question": "What is the total amount TTC for CMD-2025-002?",
+        "answer": "The total amount TTC for CMD-2025-002 is 2,508.00 €. This includes 20% VAT on 2,090.00 € HT for 100 blue T-shirts with Motif8, ordered by Entreprise TechWear on February 25, 2025.",
+        "expected_answer": "2508.00 €"
+    },
+    {
+        "question": "What is the stock of orange T-shirt with Motif5 in size XL?",
+        "answer": "The stock of orange T-shirts with Motif5 in size XL is 8 units. This limited quantity, part of 120 total units, suggests high demand, potentially requiring a reorder as of November 13, 2025.",
+        "expected_answer": "8"
+    },
+    {
+        "question": "What is the provenance of the cotton for the yellow T-shirt with Motif9?",
+        "answer": "The provenance of the cotton for the yellow T-shirt with Motif9 is Norway, labeled as 'Coton de Norvège.' This premium source enhances quality, supporting TShirtCo's ethical reputation as of late 2025.",
+        "expected_answer": "Coton de Norvège"
+    },
+    {
+        "question": "How many t-shirts are in the reordering on 27/03/2025?",
+        "answer": "A total of 50 t-shirts are included in the reordering on March 27, 2025. These red T-shirts with Motif1, overseen by Claire Moreau, aim to replenish stock across all sizes, ensuring availability into late 2025.",
+        "expected_answer": "50"
+    },
+    {
+        "question": "What is the salary of Marc Duval?",
+        "answer": "The salary of Marc Duval, the Logisticien, is 2,400 € gross per month. This wage, set since April 1, 2023, reflects his role in managing inventory and shipping, including recent dispatches as of November 2025.",
+        "expected_answer": "2 400 €"
+    },
+    {
+        "question": "What is the total amount of taxes paid in 2024?",
+        "answer": "The total amount of taxes paid in 2024 is 3,360 €. This 15% corporate tax on the 22,400 € operating profit impacts the net profit of 19,040 €, a key financial detail for the year ending December 31, 2024.",
+        "expected_answer": "3 360 €"
+    },
+    {
+        "question": "What is the stock of white T-shirt with Motif7 in size XL?",
+        "answer": "The stock of white T-shirts with Motif7 in size XL is 9 units. This small portion of the 157-unit total suggests high demand, potentially necessitating a reorder to maintain stock levels as of late 2025.",
+        "expected_answer": "9"
+    },
+    {
+        "question": "What is the price of an orange T-shirt with Motif8 in size XL?",
+        "answer": "The price of an orange T-shirt with Motif8 in size XL is 20.8 €. This price accounts for additional material and Polish cotton, offering a premium yet affordable option in TShirtCo's catalog as of November 2025.",
+        "expected_answer": "20.8 €"
+    },
+    {
+        "question": "What is the status of the production on 24/03/2025?",
+        "answer": "The status of production on March 24, 2025, is 'En cours' (In progress). Ahmed Benali was working on 100 blue T-shirts with Motif8 for CMD-2025-002, a significant order, with completion likely by now, November 13, 2025.",
+        "expected_answer": "En cours"
+    },
+    {
+        "question": "What is the total amount HT for CMD-2025-005?",
+        "answer": "The total amount HT for CMD-2025-005 is 780.00 €. This covers 30 violet T-shirts with Motif7 and 20 orange T-shirts with Motif1, ordered by Collectif ArtVif on March 28, 2025, before VAT application.",
+        "expected_answer": "780.00 €"
+    },
+    {
+        "question": "What is the provenance of the cotton for the rose T-shirt with Motif1?",
+        "answer": "The provenance of the cotton for the rose T-shirt with Motif1 is Austria, labeled as 'Coton de Autriche.' This high-quality source enhances the t-shirt's texture, aligning with TShirtCo's ethical sourcing as of late 2025.",
+        "expected_answer": "Coton de Autriche"
+    },
+    {
+        "question": "How many t-shirts were ordered in CMD-2024-002?",
+        "answer": "A total of 5 t-shirts were ordered in CMD-2024-002. Placed by Lucas Martin on April 22, 2024, this included 3 green T-shirts with Motif2 and 2 white T-shirts with Motif7, all delivered successfully.",
+        "expected_answer": "5"
+    },
+    {
+        "question": "What is the contact person at Encres ÉcoPrint?",
+        "answer": "The contact person at Encres ÉcoPrint is M. Klein. This individual manages TShirtCo's supply of ecological screen-printing inks, a key relationship supporting sustainable production as of November 2025.",
+        "expected_answer": "M. Klein"
+    },
+    {
+        "question": "What is the IBAN for TShirtCo payments?",
+        "answer": "The IBAN for TShirtCo payments is FR76 1234 5678 9000 1234 5678 901. This account facilitates secure transactions, such as the payment for FAC-2025-042, and is critical for financial operations as of late 2025.",
+        "expected_answer": "FR76 1234 5678 9000 1234 5678 901"
+    },
+    {
+        "question": "What is the payment term with CotonBio Europe?",
+        "answer": "The payment term with CotonBio Europe is 45 days. This extended credit period aids TShirtCo's cash flow when purchasing cotton from Poland and Hungary, a practice supporting its ethical supply chain as of November 2025.",
+        "expected_answer": "45 jours"
+    },
+    {
+        "question": "What is the delivery deadline for invoice FAC-2025-042?",
+        "answer": "The delivery deadline for invoice FAC-2025-042 is 30 days net from March 14, 2025, meaning payment was due by April 13, 2025. This standard term was met by Léa Moreau, as the order is recorded as delivered by late 2025.",
+        "expected_answer": "30 days"
+    },
+    {
+        "question": "How many violet T-shirts with Motif7 were ordered in CMD-2025-005?",
+        "answer": "A total of 30 violet T-shirts with Motif7 were ordered in CMD-2025-005. Placed by Collectif ArtVif on March 28, 2025, this bulk order with a custom design is planned for production, reflecting TShirtCo's versatility.",
+        "expected_answer": "30"
+    },
+    {
+        "question": "What is the amount of debt to suppliers in 2024?",
+        "answer": "The amount of debt to suppliers in 2024 is 28,920 €. This liability, owed to partners like TShirtFactory, is managed as part of TShirtCo's cash flow strategy, recorded in the year-end balance of December 31, 2024.",
+        "expected_answer": "28 920 €"
+    },
+    {
+        "question": "What is the total asset value for TShirtCo in 2024?",
+        "answer": "The total asset value for TShirtCo in 2024 is 126,320 €. This includes 28,500 € in immobilizations, 42,300 € in stock, and 36,800 € in liquid assets, providing a comprehensive financial overview as of year-end.",
+        "expected_answer": "126 320 €"
+    },
+    {
+        "question": "What is the position of Sarah Klein?",
+        "answer": "Sarah Klein holds the position of Commerciale at TShirtCo. Since November 12, 2023, she manages sales and client relationships, earning 2,600 € plus primes, a role vital to the company's market expansion as of 2025.",
+        "expected_answer": "Commerciale"
+    },
+    {
+        "question": "What is the client name for invoice FAC-2025-042?",
+        "answer": "The client name for invoice FAC-2025-042 is Léa Moreau. Residing at 45 Avenue des Lilas, 69000 Lyon, she received this invoice on March 14, 2025, for a mix of t-shirts totaling 81.12 € TTC.",
+        "expected_answer": "Léa Moreau"
+    },
+    {
+        "question": "What is the stock of rose T-shirt with Motif9 in size XL?",
+        "answer": "The stock of rose T-shirts with Motif9 in size XL is 33 units. This portion of the 144-unit total, with Austrian cotton, indicates a well-distributed inventory, supporting demand as of November 13, 2025.",
+        "expected_answer": "33"
+    },
+    {
+        "question": "What is the date of the last order in the history?",
+        "answer": "The date of the last order in the history is March 28, 2025, for CMD-2025-005 by Collectif ArtVif. This recent order, including 30 violet T-shirts, is planned for production, active as of today.",
+        "expected_answer": "28/03/2025"
+    },
+    {
+        "question": "What is the salary of the graphic design intern?",
+        "answer": "The salary of the graphic design intern, Tom Lemoine, is 800 € gross per month. This stipend, effective since September 1, 2024, supports his training, aligning with TShirtCo's talent development as of late 2025.",
+        "expected_answer": "800 €"
+    },
+    {
+        "question": "What is the amount of bank funds in 2024?",
+        "answer": "The amount of bank funds in 2024 is 35,600 €. This liquid asset, part of the 126,320 € total assets, provides flexibility for operations and debt management as recorded on December 31, 2024.",
+        "expected_answer": "35 600 €"
+    },
+    {
+        "question": "What is the price of a white T-shirt with Motif3 in size M?",
+        "answer": "The price of a white T-shirt with Motif3 in size M is 19.3 €. This covers Spanish cotton and French manufacturing, offering a competitive rate for a quality, ethically produced item as of November 2025.",
+        "expected_answer": "19.3 €"
+    },
+    {
+        "question": "What is the status of order CMD-2025-004?",
+        "answer": "The status of order CMD-2025-004 is 'Livrée' (Delivered). Placed by Léa Moreau on March 14, 2025, and shipped by March 29, 2025, this order for 3 t-shirts was completed, recorded as of today.",
+        "expected_answer": "Livrée"
+    },
+    {
+        "question": "What is the total stock of yellow T-shirts across all motifs?",
+        "answer": "The total stock of yellow T-shirts across all motifs exceeds 1,000 units when aggregated, with examples like 95 for Motif3, 106 for Motif1, and 141 for Motif9, reflecting a diverse and robust inventory as of November 2025.",
+        "expected_answer": ">1000 (approximate)"
+    },
+    {
+        "question": "How many items are planned for production on 26/03/2025?",
+        "answer": "A total of 20 items are planned for production on March 26, 2025. These orange T-shirts with Motif1 for CMD-2025-005, overseen by Ahmed Benali, highlight TShirtCo's production capacity as scheduled.",
+        "expected_answer": "20"
+    },
+    {
+        "question": "What is the social security number of Léa Garnier?",
+        "answer": "The social security number of Léa Garnier is 2 98 09 75 321 098 76. Assigned since her hiring on September 3, 2022, this identifier supports payroll and benefits compliance as of November 2025.",
+        "expected_answer": "2 98 09 75 321 098 76"
+    },
+    {
+        "question": "If a supplier delays cotton delivery by 30 days, how might TShirtCo adjust production?",
+        "answer": "A 30-day delay in cotton delivery, affecting new orders beyond March 2025, might halt production of affected motifs. TShirtCo could prioritize existing stock (e.g., 148 red Motif1 units), expedite alternative suppliers like CotonBio Europe, or inform clients of delays, ensuring minimal disruption as of November 2025.",
+        "expected_answer": "Prioritize stock, expedite suppliers, inform clients"
+    },
+    {
+        "question": "What is the average grammage of all listed T-shirts?",
+        "answer": "The average grammage of all listed T-shirts, based on samples like 182 g/m² (blue Motif5), 181 g/m² (blue Motif2), and 190 g/m² (gray Motif5), approximates 182-184 g/m² across the 100 models. This reflects a consistent medium-weight fabric standard as of late 2025.",
+        "expected_answer": "Approx. 183 g/m²"
+    },
+    {
+        "question": "What is the total debt in 2024?",
+        "answer": "The total debt in 2024 is 38,920 €, comprising 28,920 € to suppliers, 8,000 € in social/fiscal debts, and a 2,000 € bank loan remainder, managed effectively in the year-end balance as of December 31, 2024.",
+        "expected_answer": "38 920 €"
+    },
+    {
+        "question": "How many employees are on full-time contracts?",
+        "answer": "There are 7 employees on full-time contracts at TShirtCo, including John Doe, Claire Moreau, and Ahmed Benali, ensuring a dedicated workforce as of November 13, 2025, with one intern and one apprentice part-time.",
+        "expected_answer": "7"
+    }
+]
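Each entry in the evaluation file carries three string fields: `question`, `answer` and `expected_answer` (the pipeline code in this commit reads `question` and `answer`). A small sanity check before running the evaluation can catch a malformed entry early. This is a minimal sketch under that assumption; `validate_questions` and `load_questions` are hypothetical helpers, not part of the commit.

```python
import json
from typing import Dict, List

# Keys observed in data/eval/sample_questions.json
REQUIRED_KEYS = ("question", "answer", "expected_answer")


def validate_questions(items: List[Dict[str, str]]) -> List[str]:
    """Return human-readable problems; an empty list means the data is usable."""
    problems = []
    for i, item in enumerate(items):
        for key in REQUIRED_KEYS:
            value = item.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"entry {i}: missing or empty '{key}'")
    return problems


def load_questions(path: str) -> List[Dict[str, str]]:
    """Load the evaluation file as UTF-8 and fail fast on malformed entries."""
    with open(path, "r", encoding="utf-8") as f:
        items = json.load(f)
    problems = validate_questions(items)
    if problems:
        raise ValueError("; ".join(problems))
    return items
```

Reading the file explicitly as UTF-8 matters here because the answers contain characters such as `€`, `é` and `²`.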
13
main.py
@@ -1,24 +1,27 @@
 import os
 import glob
 from typing import List
+import json
 
 from src.impl.datastore import Datastore, DataItem
 from src.impl.indexer import Indexer
 from src.impl.retriever import Retriever
 from src.impl.response_generator import ResponseGenerator
+from src.impl.evaluator import Evaluator
 
 from src.RAG_pipeline import RAGpipeline
 from create_parser import create_parser
 
 DEFAULT_SOURCE_PATH = "data/source/"
-DEFAULT_EVAL_PATH =""
+DEFAULT_EVAL_PATH ="data/eval/sample_questions.json"
 
 def create_pipeline() -> RAGpipeline:
     indexer = Indexer()
     datastore = Datastore()
     retriever = Retriever(datastore= datastore)
     response_generator = ResponseGenerator()
-    return RAGpipeline(indexer = indexer, datastore = datastore, retriever= retriever, response_generator= response_generator)
+    evaluator = Evaluator()
+    return RAGpipeline(indexer = indexer, datastore = datastore, retriever= retriever, response_generator= response_generator, evaluator= evaluator)
 
 def main():
 
@@ -29,14 +32,15 @@ def main():
     source_path = getattr(args, "path", DEFAULT_SOURCE_PATH) or DEFAULT_SOURCE_PATH
     documents_path = get_files_in_directory(source_path=source_path)
 
-    #eval_path = args.eval_file if args.eval_file else DEFAULT_EVAL_PATH
-    #sample_questions = json.load(open(eval_path, "r"))
+    eval_path = args.eval_file if args.eval_file else DEFAULT_EVAL_PATH
+    with open(eval_path, "r", encoding="utf-8") as eval_file:  # close the file; the data contains non-ASCII text
+        sample_questions = json.load(eval_file)
 
     commands = {
         "run": lambda: pipeline.run(documents_path = documents_path),
         "reset": lambda: pipeline.reset(),
         "add": lambda: pipeline.add_documents(documents_path=documents_path),
-        "evaluate": lambda: pipeline.evaluate(),
+        "evaluate": lambda: pipeline.evaluate(sample_questions= sample_questions),
         "query": lambda: print(pipeline.process_query(args.prompt)),
     }
 
@@ -1,7 +1,8 @@
 from typing import List, Optional, Dict
+from concurrent.futures import ThreadPoolExecutor
 from dataclasses import dataclass
 from .interface.base_datastore import BaseDatastore
-from .interface.base_evaluator import BaseEvaluator
+from .interface.base_evaluator import BaseEvaluator, EvaluationResult
 from .interface.base_indexer import BaseIndexer
 from .interface.base_response_generator import BaseResponseGenerator
 from .interface.base_retriever import BaseRetriever
@@ -27,11 +28,44 @@ class RAGpipeline:
 
         return
 
-    def evaluate(self, arg1 = None) -> None:
-        """A compléter lors de l'ajout de l'évaluateur
-        """
-        print("Function not completed.")
-        return
+    def evaluate(self, sample_questions: List[Dict[str, str]]) -> List[EvaluationResult]:
+        """Run each sample question through the pipeline, score the answers
+        with the evaluator, and print a per-question report and the total."""
+        questions = [ item["question"] for item in sample_questions]
+        expected_answers = [ item["answer"] for item in sample_questions]
+
+        with ThreadPoolExecutor(max_workers=5) as executor:
+            results: List[EvaluationResult] = list(
+                executor.map(
+                    self.evaluate_single_question,
+                    questions,
+                    expected_answers,
+                )
+            )
+
+        for i, result in enumerate(results):
+            result_emoji = "✅" if result.is_correct else "❌"
+            print(f"{result_emoji} Q {i+1}: {result.question}: \n")
+            print(f"Response: {result.response}\n")
+            print(f"Expected Answer: {result.expected_answer}\n")
+            print(f"Reasoning: {result.reasoning}\n")
+            print("--------------------------------")
+
+        number_correct= sum(result.is_correct for result in results)
+        print(f"✨ Total Score: {number_correct}/{len(results)}")
+        return results
+
+    def evaluate_single_question(self, question: str, expected_answer: str):
+        """Call the response generator on the question and compare the result
+        with the expected answer.
+
+        Args:
+            question (str): Evaluation question to put to the model
+            expected_answer (str): Reference answer from the evaluation set
+        """
+        model_response = self.process_query(query=question)
+        return self.evaluator.evaluate(query= question, response= model_response, expected_answer= expected_answer)
+
 
     def run(self, documents_path: List[str], arg2 = None) -> None:
         self.reset()
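Because `evaluate` fans the questions out over five worker threads, requests reach the provider in bursts, which is consistent with the rate-limit failures reported in the commit message. One possible mitigation is to pace the calls with a shared limiter. The sketch below is an illustration only: `MinIntervalLimiter` is a hypothetical class, and the interval would have to be derived from the provider's published requests-per-minute limit.

```python
import threading
import time
from typing import Callable


class MinIntervalLimiter:
    """Let acquire() return at most once per `interval` seconds, across threads."""

    def __init__(
        self,
        interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = 0.0  # earliest time at which the next call may start

    def acquire(self) -> None:
        with self._lock:
            now = self._clock()
            wait = max(0.0, self._next_slot - now)
            # Reserve the slot before sleeping so other threads queue behind it.
            self._next_slot = max(now, self._next_slot) + self.interval
        if wait > 0:
            self._sleep(wait)
```

A single limiter instance could be stored on the pipeline and `acquire()` called at the top of `evaluate_single_question`, so the thread pool keeps its structure while the request rate stays bounded.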
@@ -1,11 +1,26 @@
 from ..interface.base_evaluator import EvaluationResult, BaseEvaluator
-import requests
 
+from dotenv import load_dotenv
+from groq import Groq
+from typing import Optional
+import os
+import re
 class Evaluator(BaseEvaluator):
 
-    def __init__(self, model_name: str = "llama3.2:8b", base_url: str = "http://localhost:11434"):
-        self.base_url = base_url
-        self.model_name = model_name
+    def __init__(self, api_key: Optional[str] = None):
+        load_dotenv()
+        self.api_key = api_key or os.getenv("GROQ_API_KEY")
+        if not self.api_key:
+            # load_dotenv() and os.getenv() do not raise on a missing key, so check explicitly
+            raise ValueError("GROQ_API_KEY is not set: pass api_key or define it in .env")
+        print("API key loaded successfully\n")
+
+        try:
+            self.client = Groq(api_key=self.api_key)
+            self.model = "llama-3.1-8b-instant"  # Fast and free
+            print("✅ Groq evaluator initialized successfully")
+        except Exception as e:
+            raise ValueError(f"❌ Error while initializing Groq: {e}")
 
 
     SYSTEM_PROMPT = """
@@ -28,18 +43,58 @@ Then return the result in <result>...</result> tags — either as 'true' or 'fal
         <expected_answer> \n{expected_answer} </expected_answer>
         """
 
-        response_content = requests.post(
-            f"{self.base_url}/api/generate",
-            json={
-                "model": self.model_name,
-                "prompt": user_prompt,
-                "stream": False,
-                "options": {
-                    "temperature": 0.7,
-                    "top_p": 0.9,
-                }
-            },
-        )
-
-        return super().evaluate(query, response, expected_answer)
+        try:
+            chat_completion = self.client.chat.completions.create(
+                messages = [
+                    {
+                        "role": "system",
+                        "content": self.SYSTEM_PROMPT
+                    },
+                    {
+                        "role": "user",
+                        "content": user_prompt
+                    }
+                ],
+                model=self.model,
+                temperature=0.3,
+                max_tokens=1000,
+            )
+
+            response_text = chat_completion.choices[0].message.content
+
+            reasoning_match = re.search(r'<reasoning>(.*?)</reasoning>', response_text, re.DOTALL)
+            result_match = re.search(r'<result>(.*?)</result>', response_text, re.DOTALL)
+
+            reasoning = reasoning_match.group(1).strip() if reasoning_match else "No reasoning provided"
+            result_str = result_match.group(1).strip().lower() if result_match else "false"
+
+            is_correct = result_str == "true"
+
+            return EvaluationResult(
+                question=query,
+                response= response,
+                expected_answer=expected_answer,
+                is_correct= is_correct,
+                reasoning= reasoning
+            )
+
+        except Exception as e:
+            error_msg = str(e).lower()
+
+            # Handle specific error cases
+            if "rate" in error_msg or "limit" in error_msg:
+                error_text = "⚠️ Rate limit reached. Wait and try again."
+            elif "authentication" in error_msg or "api key" in error_msg or "401" in error_msg:
+                error_text = "❌ Authentication error. Check your API key in the .env file"
+            else:
+                error_text = f"❌ Error during evaluation: {str(e)}"
+
+            return EvaluationResult(
+                question=query,
+                response= response,
+                expected_answer=expected_answer,
+                is_correct=False,
+                reasoning=error_text
+            )
+
 
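The verdict parsing in the hunk above can be checked in isolation, without calling the model. The following is a minimal sketch of the same regex logic pulled into a standalone function; `parse_verdict` is a hypothetical name, and the fallback text is an assumption rather than the commit's exact wording.

```python
import re
from typing import Tuple


def parse_verdict(text: str) -> Tuple[bool, str]:
    """Extract (is_correct, reasoning) from a judge reply using <reasoning>/<result> tags."""
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
    result = re.search(r"<result>(.*?)</result>", text, re.DOTALL)
    reasoning_text = reasoning.group(1).strip() if reasoning else "No reasoning provided"
    # A missing or malformed <result> tag counts as incorrect.
    is_correct = bool(result) and result.group(1).strip().lower() == "true"
    return is_correct, reasoning_text
```

Treating a missing `<result>` tag as incorrect mirrors the hunk's default of `"false"`, so a truncated or off-format judge reply never inflates the score.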