
Teammately
Official

Financial Report Extractor
About this AI
Summary
This AI Agent automates the extraction and structuring of information from financial reports using existing large language models (LLMs) rather than developing new ones. It converts unstructured report text into structured data, replacing the complex, manual steps that financial report processing has traditionally involved. The structured output supports quicker analysis and better-informed decisions. Key areas of focus are efficiency, automated analysis, and compliance support.
Major Use Cases
Data Structuring: Convert unstructured financial report data into structured format.
Information Retrieval: Extract specific financial metrics on demand.
Automated Analysis: Perform quick analysis using structured data outputs.
Financial Forecasting: Support forecasting models with structured inputs.
Compliance Checking: Ensure structured data meets regulatory requirements.
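As a rough illustration of the Information Retrieval and Compliance Checking use cases, the sketch below reads one metric from a structured result and lists required fields that are absent. The helper names `get_metric` and `missing_fields`, the dotted-path convention, and the list of required fields are all assumptions made for illustration; the nested shape follows the quick-test output shown later in this report.

```python
from typing import Any, List


def get_metric(structured: dict, path: str, default: Any = None) -> Any:
    """Follow a dotted path such as "financials.revenue" through nested dicts."""
    node: Any = structured
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def missing_fields(structured: dict, required: List[str]) -> List[str]:
    """Return the required dotted paths that are absent from the output."""
    return [p for p in required if get_metric(structured, p) is None]


# Sample shaped like the quick-test output in this report.
sample = {"structured_data": {"financials": {"revenue": 15000000, "expenses": 9000000}}}

revenue = get_metric(sample, "structured_data.financials.revenue")
gaps = missing_fields(
    sample,
    ["structured_data.financials.revenue", "structured_data.financials.net_income"],
)
print(revenue, gaps)
```

A real compliance check would depend on the regulatory schema in use, which this report does not specify.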
Milestone
PRD Finalization: We have documented the PRD, detailing objectives, major use cases, evaluation metrics, and AI features.
AI Development: We have developed the AI architecture focusing on financial data extraction and structuring using LLMs.
Testing: We have conducted quick tests to validate the accuracy, prompt relevance, and output consistency of the AI model.
Documentation: We have prepared comprehensive documentation and a final report outlining the AI capabilities and integration strategies.
AI Architecture & Logic Plans
AI Plans
API INPUT KEYS
report_text (Text)
STEPS
Extract Raw Financial Data
Model
openai / gpt-4o
Prompt
## Task
Extract specific financial data points from the report text.
## Context Description
You are tasked with extracting raw financial data from a financial report. The goal is to identify and capture relevant financial metrics and terminology, such as revenues, expenses, profits, losses, and other key data elements typically found in such reports. Consider variations in language that might be used to describe these terms.
## Instructions
- Use context-aware pattern recognition to detect and extract quantitative and qualitative financial data.
- Focus on capturing information that is essential for transforming into a structured format.
## Few-shot Learning Examples
### Example 1
**Report Text:** "The company reported a revenue of $5 million in Q1."
**Extracted Data:** {"revenue": "$5 million", "quarter": "Q1"}
### Example 2
**Report Text:** "Operating expenses for the fiscal year amounted to $2 million."
**Extracted Data:** {"operating_expenses": "$2 million", "fiscal_year": "current"}
## Output Format
Return structured data as JSON:
{
"financial_metric": "value",
...
}
## Report Text
{{report_text}}
(The value of the API input "report_text" is inserted here.)
## Output
Please provide the extracted financial data in structured JSON format, with no unnecessary prefixes.
Transform Data into Structured JSON
Model
openai / gpt-4o
Prompt
## Task Introduction
You are tasked with transforming extracted financial data into a structured JSON format. This involves analyzing the financial data to ensure it aligns with a predefined schema, thereby providing a clear representation of data relationships and hierarchy.
## Input Data
Use the following extracted financial data:
{{s8QqajJ3TYS4xPq28unOTw}}
(The value of the result from the step "Extract Raw Financial Data" is inserted here.)
## Instructions
1. Use the extracted financial data and convert it into a structured JSON format.
2. Ensure that each data point conforms to the specified schema and identify the relationships among data points accurately.
3. Structure the JSON output to begin with the prefix { "structured_data": ... } to wrap the JSON object.
## Example
Here's an example of how the output should be structured:
Input data:
```
Revenue: $500,000; Expenses: $300,000; Net Income: $200,000
```
Output JSON:
```
{
"structured_data": {
"financials": {
"revenue": 500000,
"expenses": 300000,
"net_income": 200000
}
}
}
```
## Output
Do not include any unnecessary prefixes in your response. Ensure the structured JSON aligns perfectly with the required data schema, and verify the accuracy of the JSON output through self-reflection.
API OUTPUT KEYS
structured_data (Text): result of the step "Transform Data into Structured JSON"
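The two steps above chain together: the raw text returned by the first prompt is substituted into the second prompt, and the second result is exposed as `structured_data`. The sketch below shows only that wiring. The templates are abbreviated stand-ins for the full prompts, and `run_pipeline`, `call_llm`, and `fake_llm` are hypothetical names; how Teammately actually executes the steps is not described on this page.

```python
from typing import Callable, List

# Abbreviated stand-ins for the two prompts shown above, not the full text.
EXTRACT_TEMPLATE = (
    "## Task\n"
    "Extract specific financial data points from the report text.\n"
    "## Report Text\n"
    "{report_text}\n"
)
TRANSFORM_TEMPLATE = (
    "## Input Data\n"
    "Use the following extracted financial data:\n"
    "{extracted}\n"
    "## Instructions\n"
    'Wrap the JSON object as {{ "structured_data": ... }}.\n'
)


def run_pipeline(report_text: str, call_llm: Callable[[str], str]) -> dict:
    """Run step 1, feed its raw text into step 2, and expose step 2 as structured_data."""
    extracted = call_llm(EXTRACT_TEMPLATE.format(report_text=report_text))
    structured = call_llm(TRANSFORM_TEMPLATE.format(extracted=extracted))
    return {"structured_data": structured}


seen_prompts: List[str] = []


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a gpt-4o call; returns canned text so this runs offline."""
    seen_prompts.append(prompt)
    if prompt.startswith("## Task"):
        return '{"revenue": "$15 million", "expenses": "$9 million"}'
    return '{"structured_data": {"financials": {"revenue": 15000000, "expenses": 9000000}}}'


result = run_pipeline("Revenue was $15 million; expenses were $9 million.", fake_llm)
print(result["structured_data"])
```

Passing the model call in as a function keeps the wiring testable without network access; a real client would replace `fake_llm`.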
Here are several examples of the input and output of this model.
Quick Test
Input (AI Synthesized): report_text
Run: 2 steps, 2,576 ms
Output (from your model in draft): structured_data
```json
{
"structured_data": {
"financials": {
"revenue": 15000000,
"expenses": 9000000
}
}
}
```
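The quick-test output above is displayed inside a Markdown `json` fence. If that fence is part of the returned text (this page does not confirm it), a caller would need to strip it before parsing. The sketch below is one way to do that and to check for the `structured_data` wrapper key; `parse_structured_output` is a hypothetical helper, not part of any Teammately client.

```python
import json
import re

# The fence marker is built at runtime so this example can itself sit in a fenced block.
FENCE = "`" * 3
_FENCED = re.compile(rf"^{FENCE}(?:json)?\s*\n(.*?)\n{FENCE}$", re.DOTALL)


def parse_structured_output(text: str) -> dict:
    """Strip an optional Markdown code fence, parse the JSON, and check the wrapper key."""
    stripped = text.strip()
    match = _FENCED.match(stripped)
    if match:
        stripped = match.group(1)
    data = json.loads(stripped)
    if not isinstance(data, dict) or "structured_data" not in data:
        raise ValueError('expected a JSON object with a "structured_data" key')
    return data


body = '{"structured_data": {"financials": {"revenue": 15000000, "expenses": 9000000}}}'
fenced = f"{FENCE}json\n{body}\n{FENCE}"

parsed_fenced = parse_structured_output(fenced)
parsed_plain = parse_structured_output(body)
print(parsed_fenced["structured_data"]["financials"])
```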
Quick Evaluation by LLM Judges
Metric
Accuracy of Financial Data Extraction and Structuring
Score
Reason
The model accurately extracted revenue ($15 million) and expenses ($9 million) from the report text and structured them into a JSON object with the correct keys. The output is a valid JSON object with the expected schema, demonstrating complete accuracy in data extraction and structuring.
Metric
Prompt Relevance
Score
Reason
The model response demonstrates a full understanding of the financial report and correctly extracts the revenue and expenses into a structured JSON format. The structured_data section accurately reflects the financial metrics from the report text, aligning with the expected input and output. No irrelevant or ambiguous queries are present, and the data extraction is consistent and correct, resulting in a valid JSON structure.
Metric
Output Consistency
Score
Reason
The model consistently produces the same JSON structure and extracts the correct numerical values for revenue and expenses. The output is identical in structure and data across multiple identical requests, demonstrating complete consistency and reliability.
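The Output Consistency judgment above is made by an LLM Judge. As a complementary, purely mechanical check, repeated runs on the same input could be compared after canonical serialization. This is a sketch of that exact-match idea, not the judge's method; `canonical` and `consistency_rate` are hypothetical names and the sample runs are invented.

```python
import json
from collections import Counter
from typing import List


def canonical(output: dict) -> str:
    """Serialize with sorted keys so key order does not count as a difference."""
    return json.dumps(output, sort_keys=True, separators=(",", ":"))


def consistency_rate(outputs: List[dict]) -> float:
    """Share of runs that match the most common output (1.0 means all identical)."""
    if not outputs:
        raise ValueError("need at least one output")
    counts = Counter(canonical(o) for o in outputs)
    return counts.most_common(1)[0][1] / len(outputs)


# Invented sample: two runs agree (keys in a different order), one drops a field.
runs = [
    {"structured_data": {"financials": {"revenue": 15000000, "expenses": 9000000}}},
    {"structured_data": {"financials": {"expenses": 9000000, "revenue": 15000000}}},
    {"structured_data": {"financials": {"revenue": 15000000}}},
]
rate = consistency_rate(runs)
print(round(rate, 3))
```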
Evaluation Results
Evaluation Report
Introduction
Evaluation target plan
[Financial Report Structuring AI Plan](/genflows/pMlKMpEwRjOQ7qkFdkFMng/develop/nUs2MCS1TbC1sxhFVe4jtA)
Datasets to test this AI model
We prepared 49 test cases covering 5 major use cases, generated by the LLM Dataset Synthesizer:
Data Structuring: Convert unstructured financial report data into structured format.
Information Retrieval: Extract specific financial metrics on demand.
Automated Analysis: Perform quick analysis using structured data outputs.
Financial Forecasting: Support forecasting models with structured inputs.
Compliance Checking: Ensure structured data meets regulatory requirements.
LLM Judge
We ran this AI on the prepared test datasets and had LLM Judges analyze the responses.
Each of the 3 metrics is graded on a 3-level scale: "Perfect", "OK", or "Bad".
The LLM Judges used are as follows:
Accuracy of Financial Data Extraction and Structuring: Evaluates how accurately the AI agent extracts and structures financial data, ensuring it consistently produces valid JSON outputs that match the expected schema.
Output Consistency: Assesses how consistently the AI agent generates identical structured outputs when provided with similar or identical inputs. This is critical for ensuring reliability across various requests.
Prompt Relevance: Determines the effectiveness of prompts in eliciting relevant information from financial reports, ensuring the outputs align with the targeted metrics specified in the PRD.
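The results that follow report both a share of "Perfect" ratings and a per-use-case score with a maximum of 2.0. The report does not state how labels map to scores; the sketch below assumes Perfect = 2, OK = 1, Bad = 0, which is consistent with that maximum but is a guess. The labels in the example are invented and are not the evaluation data.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed mapping; the report only shows that the maximum score is 2.0.
GRADE_POINTS = {"Perfect": 2, "OK": 1, "Bad": 0}


def summarize(labels: List[Tuple[str, str]]) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (average score per use case, share of each grade) for (use_case, grade) pairs."""
    by_case = defaultdict(list)
    for use_case, grade in labels:
        by_case[use_case].append(GRADE_POINTS[grade])
    averages = {uc: sum(points) / len(points) for uc, points in by_case.items()}
    total = len(labels)
    shares = {g: sum(1 for _, x in labels if x == g) / total for g in GRADE_POINTS}
    return averages, shares


# Invented labels for illustration only.
example_labels = [
    ("Data Structuring", "Perfect"),
    ("Data Structuring", "Perfect"),
    ("Automated Analysis", "Perfect"),
    ("Automated Analysis", "Bad"),
]
averages, shares = summarize(example_labels)
print(averages, shares)
```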
Evaluation Results
Performance
Accuracy of Financial Data Extraction and Structuring
The overall performance on this metric is high, with 96.2% of cases rated as Perfect and only 3.8% as Bad. However, for use cases such as Automated Analysis, the score is 1.8 compared to the maximum score of 2.0 achieved in other scenarios. This indicates a potential weakness in automating analysis tasks.
Prompt Relevance
Similar to the accuracy metric, prompt relevance also shows 96.2% Perfect ratings. The use cases of Automated Analysis and Information Retrieval scored slightly lower (1.8 and 1.9 respectively), suggesting room for improvement in tailoring prompts for these specific tasks.
Output Consistency
Matching the trends in accuracy and prompt relevance, output consistency is strong with 96.2% rated Perfect. Yet, Automated Analysis again scores lower (1.8), highlighting a consistent weak point across multiple metrics for automated processes.
Overall Metric Evaluation
Performance by Use Case
Key Insights:
The AI performs robustly across most scenarios but shows weaknesses in the Automated Analysis use case.
Prompt optimization might be a key area for improvement, especially to enhance performance in automated and information retrieval tasks.
Consistency in structured outputs needs reinforcement when dealing with complex automated processes.
By addressing these areas, overall effectiveness can be bolstered, ensuring the AI agent functions reliably across all intended applications.
Potential Hallucinations & Common Error Patterns
Lack of Data Extraction: The AI struggles with extracting financial data, especially when given seemingly straightforward prompts. For example, with the input "The 2023 Annual Report contains a balance sheet, income statement, and statement of cash flows.", the AI output was {"structured_data": {}}. This showcases a failure to extract essential financial components, leading to empty or missing structured outputs.
Contextual Misalignment: When faced with non-financial or random content, the AI provides irrelevant or empty structured outputs. In the input "Completely Random Text: Dogs bark at midnight, the sky is blue, and oranges are citrus-filled sunshines.", the AI inaccurately generated {"structured_data": {"financials": {}}}, demonstrating an inability to differentiate and appropriately handle non-financial contexts.
Prompt Relevance and Consistency Issues: The AI exhibits inconsistency when generating outputs for identical or logically related prompts. It fails to produce similar outputs for similar inputs, violating prompt relevance and output consistency requirements. This inconsistency is evident as structured outputs do not reflect the necessary financial data extraction, like in the repetitive failure with the 2023 Annual Report input, highlighting a systemic misunderstanding of financial report prompts.
Conclusions
Is this model production ready?
Almost ready. The model exhibits promising results, with 96.2% of the evaluated cases rated Perfect across various metrics such as accuracy, prompt relevance, and output consistency. This indicates a high level of reliability in its ability to extract, structure, and generate consistent outputs from financial reports.
However, specific use cases such as Automated Analysis have shown lower average scores of 1.8 and slight weaknesses compared to other scenarios. Given that the "Bad" rating is at 3.8%, the model is quite close to being production-ready but requires some refinements.
It is crucial to stress that this assessment involves AI-judged metrics and human review is paramount. Continuous monitoring and analysis of logs are advised, particularly when dealing with unexpected or novel inputs in a production environment to ensure robust performance.
Future Improvements
Enhanced Prompt Engineering:
Future iterations should focus on refining prompt engineering strategies, especially for underperforming use cases like Automated Analysis and Information Retrieval. By tweaking prompts to better align with task-specific requirements, the effectiveness and accuracy of the outputs can be improved. Incorporating domain experts in the development of these prompt strategies could provide insights that align AI performance with real-world expectations.
Integration of Contextual Understanding:
To address the issues regarding contextual misalignment and irrelevant data extraction, future development can integrate contextual understanding mechanisms. By leveraging additional knowledge bases or contextual AI models, the system can improve its ability to differentiate between financial and non-financial content, leading to more relevant and accurate structured outputs. This integration can also enhance the model's ability to function in diverse scenarios, further preparing it for full-scale deployment.
Integration
How this model is served
This AI model is already deployed and available via the Teammately API endpoint: https://tmmt.ly/:id
Integration Example
Example 1: Financial Data Structuring
For structuring financial report data into a JSON format, you can integrate the API into your Python application as follows:
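```python
import json

import requests

# ":id" is a placeholder in the endpoint URL.
url = "https://tmmt.ly/:id"
headers = {
    "Authorization": "Bearer YOUR_TEAMMATELY_API_KEY",
    "Content-Type": "application/json",
}
payload = {
    "input": {
        "report_text": "Insert your financial report text here"
    }
}

# Send the report text; the timeout keeps a stalled request from hanging forever.
response = requests.post(url, headers=headers, data=json.dumps(payload), timeout=30)
response.raise_for_status()  # Raise on HTTP errors instead of parsing an error body.
structured_data = response.json()
print("Structured Data:", json.dumps(structured_data, indent=4))
```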
Example 2: Financial Metrics Retrieval
For extracting specific financial metrics on demand, you can integrate the API into a Node.js application using the following code:
```javascript
const axios = require('axios');

// ":id" is a placeholder in the endpoint URL.
const url = 'https://tmmt.ly/:id';
const headers = {
  'Authorization': 'Bearer YOUR_TEAMMATELY_API_KEY',
  'Content-Type': 'application/json'
};
const data = {
  input: {
    report_text: 'Insert your financial report text focusing on specific metrics'
  }
};

// POST the report text and print the parsed JSON response.
axios.post(url, data, { headers: headers })
  .then(response => {
    console.log('Extracted Financial Metrics:', JSON.stringify(response.data, null, 4));
  })
  .catch(error => {
    console.error('Error retrieving financial metrics:', error);
  });
```
Frontend Example
Next: How to improve more?
Challenge: Limited Evaluation Metrics
While the AI product has demonstrated promising results during initial testing, the evaluation is not exhaustive. We suggest running the AI system against hundreds of test cases to confirm readiness for production. Teammately Agents can synthesize test cases and generate tailored LLM Judges for scalable evaluation.
Enhancement: Integration with Knowledge Bases
To augment the AI's capabilities, consider integrating it with domain-specific knowledge bases. For instance, coupling the AI model with medical knowledge bases could enhance diagnostic accuracy in healthcare applications. Similarly, integrating with financial databases could improve risk assessment and decision-making in the finance sector.
Optimization: Cost and Latency Reduction
There exists potential to optimize further by experimenting with smaller models to reduce both costs and response times. Teammately Agents can assist by iterating on different model architectures to maintain quality while employing continuous evaluations by LLM Judges.
Opportunity: Feedback Loop Implementation
Implement a user feedback loop to gather insights directly from end-users. This real-world feedback will facilitate continuous improvement of the AI system, ensuring it evolves in line with user needs and organizational goals.
Consideration: Ethical and Regulatory Compliance
As AI deployment broadens, it is crucial to regularly evaluate ethical and regulatory standards. Continuous monitoring and adjustments will ensure that the AI product remains compliant with evolving privacy legislation and ethical AI guidelines.
Each section provides proactive steps and considerations for further enhancement of the AI product, aiming towards greater efficiency, compliance, and integration capabilities.