
Teammately
Official
Fashion Recommendation
About this AI
Summary
This project aims to create a sophisticated AI Agent-Based Fashion Recommendation system for an online fashion brand. Leveraging large language models (LLMs), it focuses on personalizing user experiences by recommending fashion items tailored to individual preferences and historical interactions. The AI utilizes advanced prompt engineering techniques and emphasizes robust validation methods to ensure the accuracy and relevance of its recommendations. The system is intended to serve a wide user base through an API, providing tailored fashion advice by integrating seamlessly with other platforms.
Major Use Cases
Personalized Rec: Customize recommendations based on user preferences and history.
Trend-Spotting: Identify and suggest trending fashion items.
Event-Specific: Curate recommendations for special occasions or events.
Cross-Sell: Suggest complementary items to cart contents.
Feedback Loop: Enhance recommendations with user ratings and comments.
Milestone
AI Development: We have successfully built the AI architecture and logic, focusing on user profiling and query analysis to generate personalized fashion suggestions.
Testing: We have conducted quick tests to ensure functionality and relevance, achieving high relevance scores.
Integration: We have compiled the AI into an API, enabling integration with external systems for widespread accessibility.
AI Architecture & Logic Plans
AI Plans
API INPUT KEYS
user_query (Text)
interaction_history (Text)
STEPS
User Profiling & Query Analysis
Model
openai / gpt-4o
Prompt
## User Profiling & Query Analysis
### Role: AI Assistant
You are an AI assistant that specializes in understanding user preferences and providing fashion recommendations. Your task is to analyze user input consisting of a `user_query` and `interaction_history` to create a detailed and personalized user profile.
## User Query Analysis
- Analyze the current user query: {{user_query}} [the value of the API input "user_query" is inserted here]
- Identify key fashion preferences and style inclinations.
## Interaction History Assessment
- Evaluate past interactions for insights: {{interaction_history}} [the value of the API input "interaction_history" is inserted here]
- Highlight recurring themes, preferences, and any notable patterns in fashion choices.
## Profile Construction
From your analyses, generate a refined user profile reflecting the insights discovered:
- **fashion_preferences**: Summarize key fashion items or styles that the user favors.
- **style_inclinations**: Describe the typical style preferences, such as casual, formal, etc.
- **contextual_data**: Note any contextual factors like seasonality, event-based preferences, or recent trends.
### Example:
{
  "fashion_preferences": ["floral tops", "denim skirts"],
  "style_inclinations": "casual, summer styles",
  "contextual_data": "seasonal trends favor light, airy fabrics"
}
## Output
Generate a JSON object with the fields mentioned above, reflecting the user's refined profile.
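The profile produced by this step could be checked before it is passed on. The following is a minimal sketch, not the platform's own validation: the function name `parse_user_profile` is hypothetical, and the field types (a list and two strings) are an assumption inferred from the example in the prompt.

```python
import json

# Field names come from the prompt above; the expected types are an
# assumption inferred from the prompt's example object.
REQUIRED_FIELDS = {
    "fashion_preferences": list,
    "style_inclinations": str,
    "contextual_data": str,
}


def parse_user_profile(raw: str) -> dict:
    """Parse the profiling step's JSON output and check the three fields."""
    profile = json.loads(raw)
    if not isinstance(profile, dict):
        raise ValueError("profile must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in profile:
            raise ValueError(f"missing field: {field}")
        if not isinstance(profile[field], expected_type):
            raise ValueError(f"{field} should be of type {expected_type.__name__}")
    return profile


# The example object from the prompt, used here as sample input.
example = """{
  "fashion_preferences": ["floral tops", "denim skirts"],
  "style_inclinations": "casual, summer styles",
  "contextual_data": "seasonal trends favor light, airy fabrics"
}"""
profile = parse_user_profile(example)
print(profile["fashion_preferences"])
```

Rejecting a malformed profile early would keep a bad first step from silently degrading the recommendation step that consumes it.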
Recommendation Generation
Model
openai / gpt-4o
Prompt
## Context Setup
You are a specialized recommendation engine designed to provide tailored fashion product suggestions to users based on their preferences and past interactions. Your goal is to maximize relevance and accuracy while minimizing computational resources.
## User Information
The following is the refined user profile containing analyzed preferences and context, derived from specific user interactions and queries:
{{l484ZnoSS2mMNqPsdjhPcw}} [the result of the step "User Profiling & Query Analysis" is inserted here]
## Task
Using the provided user profile, generate a list of fashion product recommendations. Assess each product's relevance and assign a recommendation score based on how well it aligns with the user's profile. Your aim is to craft precise, impactful recommendations that cater to the specific tastes and interests of the user.
## Output Format
Your response should be a structured JSON object including the following fields:
- `product_name`: The name of the recommended fashion item.
- `product_url`: A URL link to the product page.
- `recommendation_score`: A numerical value representing the suitability of the item to the user's preferences.
### Example Output
{
  "products": [
    {
      "product_name": "Blue Floral Summer Dress",
      "product_url": "http://example.com/blue-dress",
      "recommendation_score": 0.95
    },
    {
      "product_name": "Casual Denim Jacket",
      "product_url": "http://example.com/denim-jacket",
      "recommendation_score": 0.89
    }
  ]
}
## Review and Refine
Before providing the final recommendations, conduct a self-reflection step to evaluate and enhance the relevance of your initial suggestions based on user criteria. Adjust the recommendation scores accordingly to ensure user satisfaction and adherence to preferred styles.
Response:
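Both prompts rely on `{{...}}` placeholders, filled either with an API input or with the result of an earlier step. How the platform performs this substitution is not documented here; the sketch below is one plausible way to do it, with the hypothetical helper `render_prompt` and a plain dictionary of values.

```python
import re

# Matches placeholders such as {{user_query}} or {{l484ZnoSS2mMNqPsdjhPcw}}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_]+)\s*\}\}")


def render_prompt(template: str, values: dict) -> str:
    """Replace each {{name}} with values[name]; fail loudly if one is missing."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value for placeholder: {key}")
        return str(values[key])

    return PLACEHOLDER.sub(substitute, template)


# Step 1 receives the API inputs directly.
step_one = render_prompt(
    "- Analyze the current user query: {{user_query}}",
    {"user_query": "casual summer dresses"},
)

# Step 2 receives the output of step 1 under that step's ID.
step_two = render_prompt(
    "User profile:\n{{l484ZnoSS2mMNqPsdjhPcw}}",
    {"l484ZnoSS2mMNqPsdjhPcw": '{"style_inclinations": "casual, summer styles"}'},
)
print(step_one)
print(step_two)
```

Raising on a missing placeholder, rather than leaving the braces in place, avoids sending a half-rendered prompt to the model.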
API OUTPUT KEYS
products (Text, from step: Recommendation Generation)
Here are several examples of the input and output of this model.
Quick Test
Input
AI Synthesized
interaction_history
user_query
Run: 2 steps, 11,580 ms
Output
from your model in draft
products
```json
{
  "products": [
    {
      "product_name": "Bright Yellow Floral Evening Dress",
      "product_url": "http://example.com/bright-yellow-floral-evening-dress",
      "recommendation_score": 0.98
    },
    {
      "product_name": "Coral Pink Summer Dress",
      "product_url": "http://example.com/coral-pink-summer-dress",
      "recommendation_score": 0.96
    },
    {
      "product_name": "Turquoise Floral Top",
      "product_url": "http://example.com/turquoise-floral-top",
      "recommendation_score": 0.94
    },
    {
      "product_name": "Vibrant Orange Evening Gown",
      "product_url": "http://example.com/vibrant-orange-evening-gown",
      "recommendation_score": 0.93
    },
    {
      "product_name": "Bright Multicolor Maxi Dress",
      "product_url": "http://example.com/bright-multicolor-maxi-dress",
      "recommendation_score": 0.92
    }
  ]
}
```
### Review and Refinement
- **Relevance Check**: Each product aligns with the user's preference for evening dresses, floral designs, and vibrant colors, suitable for events such as weddings.
- **Score Adjustment**: Recommendation scores were weighed based on the user's preference for warm weather suitability and affordability. The chosen products reflect these priorities while maintaining a stylish and formal appearance.
- **Final Verification**: Ensured each product is accompanied by a working URL link to facilitate immediate exploration and purchase by the user.
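In this quick test the `products` value is text: a fenced JSON block followed by free-form review notes. A caller that needs structured data would have to pull the JSON out first. The sketch below shows one way to do that; `extract_products` is a hypothetical helper, and whether every response is wrapped this way is an assumption based on this single sample.

```python
import json
import re

# Build the fence marker programmatically so this example can itself sit
# inside a fenced block without ambiguity.
FENCE_MARK = "`" * 3
FENCED_JSON = re.compile(
    re.escape(FENCE_MARK) + r"(?:json)?\s*(\{.*?\})\s*" + re.escape(FENCE_MARK),
    re.DOTALL,
)


def extract_products(raw: str) -> list:
    """Return the product list from a response that may wrap JSON in a fence."""
    match = FENCED_JSON.search(raw)
    # Fall back to treating the whole text as JSON when no fence is found.
    payload = match.group(1) if match else raw
    data = json.loads(payload)
    return data["products"]


# Shortened version of the quick-test output above.
sample = (
    FENCE_MARK + "json\n"
    '{"products": [{"product_name": "Coral Pink Summer Dress", '
    '"product_url": "http://example.com/coral-pink-summer-dress", '
    '"recommendation_score": 0.96}]}\n'
    + FENCE_MARK + "\n### Review and Refinement\n- **Relevance Check**: ..."
)
products = extract_products(sample)
print(products[0]["product_name"])
```

Note that the URLs in the sample point at example.com, so a caller may also want to verify links against a real catalog before showing them to users.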
Quick Evaluation by LLM Judges
Metric
Relevance Score
Score
Reason
The model's response demonstrates a high degree of relevance to the user's preferences. The suggested products ('Bright Yellow Floral Evening Dress', 'Coral Pink Summer Dress', etc.) directly address the user's request for a stylish, affordable evening dress for a wedding. The inclusion of "floral" and "bright colors" aligns with the user's prior browsing history. The recommendation scores further suggest a prioritization of relevant items. The presence of working URLs for each product also satisfies the requirement for facilitating immediate exploration and purchase. All must-have elements are present, and no must-not-have elements are identified.
Metric
Engagement Rate
Score
Reason
The model response provides a list of products with associated URLs and scores, aligning with the expected JSON structure. The recommendations are relevant to the user's query and history (e.g., evening dresses, floral designs, bright colors). However, there is no indication of user interaction metrics (click-through rates, cart additions, purchases). The lack of these metrics prevents a grade of 2, as substantial user engagement is not demonstrated.
Metric
Trend Accuracy
Score
Reason
The model response demonstrates a strong correlation with current fashion trends. The suggested products (e.g., "Bright Yellow Floral Evening Dress") align with the user's preferences for floral designs and bright colors, which are frequently seen in current fashion trends. The inclusion of "evening gown" and "maxi dress" styles further suggests an understanding of current evening wear trends. The model also provides working URLs, fulfilling the requirement for actionable recommendations. All required elements are present and align with the trend-spotting use case.
Evaluation Results
Evaluation Report
Introduction
Evaluation target plan
[PersonalizedAI_FashionRecommender](/genflows/35kDk5EcRpKN6tZNwp88Lg/develop/hq0WAZWDRo6Jis3f-Bc8nA)
Datasets to test this AI model
We've prepared 50 cases from 5 major use cases, generated by LLM Dataset Synthesizer, including:
Personalized Rec: Customize recommendations based on user preferences and history.
Trend-Spotting: Identify and suggest trending fashion items.
Event-Specific: Curate recommendations for special occasions or events.
Cross-Sell: Suggest complementary items to cart contents.
Feedback Loop: Enhance recommendations with user ratings and comments.
LLM Judge
We ran this AI on the prepared test datasets and had LLM Judges analyze the responses. We evaluated 3 metrics, each labeled on a 3-grade scale: "Perfect", "OK", or "Bad".
The LLM Judges used are as follows:
Engagement Rate: Measures how effectively the AI recommendations drive user interactions and satisfaction.
Relevance Score: Assesses how well the suggestions match user preferences and previous interactions.
Trend Accuracy: Evaluates the AI's capability to accurately predict and recommend trending fashion items.
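The results that follow report both label shares (percentages per grade) and per-use-case averages on a scale up to 2.0. A minimal sketch of how such figures could be derived from judge labels is shown below. The mapping Perfect = 2, OK = 1, Bad = 0 is an assumption (the report only mentions "a grade of 2" as the top grade), and the sample labels are illustrative, not taken from the evaluation.

```python
from collections import Counter

# Assumed numeric mapping for the three grades; not confirmed by the report.
GRADE_POINTS = {"Perfect": 2, "OK": 1, "Bad": 0}


def summarize(labels):
    """Return (percentage share per grade, mean score) for a list of labels."""
    total = len(labels)
    if total == 0:
        raise ValueError("no labels to summarize")
    counts = Counter(labels)
    shares = {
        grade: round(100 * counts.get(grade, 0) / total, 1)
        for grade in GRADE_POINTS
    }
    # An unknown label raises KeyError here, which flags bad judge output.
    mean_score = sum(GRADE_POINTS[label] for label in labels) / total
    return shares, round(mean_score, 2)


# Illustrative labels only.
shares, mean_score = summarize(["Perfect", "Perfect", "OK", "Bad"])
print(shares)
print(mean_score)
```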
Evaluation Results
Performance
The AI system's Relevance Score shows a dominant 84.6% in the Perfect category, with the remaining 15.4% in OK, indicating consistently strong performance in delivering relevant recommendations. Engagement Rate sees 67.3% in OK with 32.7% in Perfect, suggesting room for improvement in engaging users more effectively. For Trend Accuracy, 46.2% is OK and 53.8% is Perfect, reflecting moderate capability in identifying and suggesting trending items.
[Chart: Overall Performance by Metric]
Use Case Analysis
Personalized Rec and Trend-Spotting consistently score 1.8 to 2.0 across metrics, indicating robust ability to customize and identify trends ⛅️. Cross-Sell gets a perfect 2.0 in Relevance Score but a lower 1.3 in Engagement Rate, revealing room to improve engagement strategies. Feedback Loop is less effective, especially in Engagement Rate at 1.2, highlighting a need for refinement in user interaction methods. Event-Specific use cases perform adequately with 1.9 in Relevance Score but only 1.3 in Engagement Rate, hinting at opportunities to better target special occasions.
[Chart: Use Case Performance]
Key Recommendations
Focus on enhancing Engagement Rate across all use cases, particularly within the Feedback Loop and Event-Specific categories.
Improve mechanisms to transition OK scores into Perfect, particularly in Trend Accuracy to maximize AI's potential in identifying trends.
Consider refining strategies for Cross-Sell engagement to fully capitalize on its high relevance capability.
These insights reveal a structured pathway to optimize the AI's performance, consistently aligning with user needs and preferences. 🌟
Potential Hallucinations & Common Error Patterns
Misalignment with User Preferences: In several cases, the AI recommendations only partially aligned with user preferences. For example, in the athletic wear query, items such as a T-shirt were recommended alongside yoga pants and running shoes despite the explicit preferences for the latter. Similarly, in the hiking gear for outdoor adventure scenario, a thermal long sleeve shirt was suggested, which didn't align strongly with the user's preference for cargo pants and hiking boots.
Nonsensical Query Handling: The AI seems to struggle with nonsensical or partially formed user queries. In the asdfgh fashion trends case, the AI provided a relevant fashion-themed output, but it lacked a strong connection to a meaningful user query, indicating potential difficulties in understanding vague or incomplete inputs.
Limited Contextual Understanding: The AI occasionally demonstrated a lack of contextual understanding in suggestions. For instance, in the office wear query with past purchases involving blazers and pencil skirts, relevant items like blazers and pencil skirts were recommended, but the overall engagement potential wasn't fully realized due to missing data on user interactions like clicks or purchases, suggesting a gap in capturing engagement context.
These patterns underscore an opportunity for refining the AI's ability to better align recommendations with nuanced user preferences and contextually interpret less clear inputs effectively.
Conclusions
Is this model production ready?
Almost ready: the evaluation reveals strong overall performance with some areas for improvement. The AI achieved a dominant 84.6% in the Perfect category for Relevance Score, demonstrating high accuracy in matching user preferences. For the Trend Accuracy metric, 53.8% is Perfect, showing reasonable proficiency in predicting trends. The Engagement Rate, however, has 67.3% in OK and only 32.7% in Perfect, indicating a need to boost user interaction.
In general, Bad records are less than 3%, which suggests the system is functional and nearly ready for deployment. However, careful monitoring is advised, because the judgment relies on AI analysis, and the system's behavior with unexpected inputs should be assessed continuously.
Future Improvements
Enhance Engagement Mechanisms: Focus on refined strategies to transition Engagement Rate scores from OK to Perfect. This could involve user-centric initiatives like interactive prompts or incentives that encourage users to engage more actively with recommended items. Personalized follow-ups post-recommendation can also drive engagement.
Contextual Understanding: Improve the AI's contextual interpretation ability, especially for scenarios like the office wear query, where the AI failed to maximize engagement potential. This could be achieved by incorporating more user interaction data to better capture and predict nuanced user behaviors, thereby aligning recommendations more closely with user intents.
Integration
How this model is served
The AI model is already deployed and available for use at the endpoint "https://tmmt.ly/:id". This endpoint provides access to the AI's fashion recommendation capabilities via a RESTful API service.
Integration Example
Below are two examples that demonstrate how to integrate this AI into applications. These examples are designed to showcase different use cases for the AI service.
1. Use Case: Personalized Fashion Recommendations in a Mobile App

```python
import requests

# Endpoint from this report; ":id" is presumably a placeholder for the model ID.
url = "https://tmmt.ly/:id"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
}
# Both input keys are plain text, matching the API input keys listed earlier.
data = {
    "input": {
        "user_query": "casual summer dresses",
        "interaction_history": "liked items include floral tops, denim skirts",
    }
}

# A timeout keeps the call from hanging; the quick test run took about 11.6 s.
response = requests.post(url, headers=headers, json=data, timeout=30)
if response.status_code == 200:
    recommendations = response.json()
    print(recommendations)
else:
    print("Error:", response.status_code, response.text)
```

This script makes a POST request to the API, fetching personalized recommendations for casual summer dresses based on the user's interaction history.
2. Use Case: Integrating Recommendations in an E-commerce Platform Backend

```javascript
// require() works with node-fetch v2; v3 is ESM-only and needs `import`.
const fetch = require('node-fetch');

const url = 'https://tmmt.ly/:id';
const headers = {
  'Authorization': 'Bearer YOUR_API_KEY',
  'Content-Type': 'application/json',
};
const data = {
  input: {
    user_query: 'evening gowns',
    interaction_history: 'purchased cocktail dresses, liked party shoes'
  }
};

fetch(url, {
  method: 'POST',
  headers: headers,
  body: JSON.stringify(data)
})
  .then(response => {
    // Surface HTTP errors instead of parsing an error body as JSON.
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
  })
  .then(recommendations => {
    console.log('Recommendations received:', recommendations);
  })
  .catch(error => {
    console.error('Error:', error);
  });
```

This Node.js script integrates the AI recommendation system into an e-commerce backend, where it can recommend evening gowns based on previous purchases.
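Around either integration example, an application would typically build the request body from user data and then decide which returned items to display. The sketch below illustrates both steps without touching the network. The helpers `build_payload` and `top_products`, the 0.9 score threshold, and the limit of 3 are all hypothetical choices; only the `input`, `user_query`, `interaction_history`, and product field names come from this report.

```python
def build_payload(user_query: str, interaction_history: str) -> dict:
    """Build the request body in the shape used by the integration examples."""
    if not user_query.strip():
        raise ValueError("user_query must not be empty")
    return {
        "input": {
            "user_query": user_query,
            "interaction_history": interaction_history,
        }
    }


def top_products(products: list, min_score: float = 0.9, limit: int = 3) -> list:
    """Keep products at or above min_score, highest score first."""
    kept = [p for p in products if p.get("recommendation_score", 0) >= min_score]
    kept.sort(key=lambda p: p["recommendation_score"], reverse=True)
    return kept[:limit]


payload = build_payload("casual summer dresses", "liked floral tops, denim skirts")

# Products from the example output in the Recommendation Generation prompt.
example_products = [
    {"product_name": "Casual Denim Jacket",
     "product_url": "http://example.com/denim-jacket",
     "recommendation_score": 0.89},
    {"product_name": "Blue Floral Summer Dress",
     "product_url": "http://example.com/blue-dress",
     "recommendation_score": 0.95},
]
shortlist = top_products(example_products)
print([p["product_name"] for p in shortlist])
```

Rejecting an empty query before the request is a cheap guard, given that the evaluation found the model still produces output for queries that carry little meaning.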
Frontend Example
Next: How to improve more?
Challenges and Cautions
Understand Limitations: It is crucial for developers to be aware of the limitations inherent in the AI model, such as potential biases in training data, limited contextual understanding beyond its training, and sensitivity to initial training prompts. Continuous monitoring and iterative feedback are essential to mitigate any adverse effects and ensure the AI functions within ethical boundaries.
Potential Improvements
Comprehensive Evaluation: If evaluation has not been thoroughly completed, it is recommended to validate the AI using a large set of diverse and challenging test cases. Leveraging tools like Teammately Agents to synthesize these test cases and employing tailored LLM Judges for large-scale evaluation could substantiate the model's readiness for production.
Integration with Knowledge Bases: Consider enhancing the AI's capabilities by integrating it with external knowledge bases. For example, connecting with Wikipedia, industry-specific databases, or custom-curated repositories could provide real-time, context-rich answers and improve accuracy and depth in responses.
Cost and Latency Reduction: To optimize operational costs and minimize response latency, explore the use of smaller models while maintaining quality. Initiating iterations with smaller architectures and conducting continuous evaluations using LLM Judges can ensure that performance standards are upheld when scaling down.
Enhanced Natural Language Understanding: Continuous advancements in language models could be integrated to improve the AI’s ability to understand and generate more nuanced and contextually relevant responses.
User Feedback Loop: Implementing a feedback loop where real users provide insights on AI interactions can significantly enhance the system by incorporating human-centered improvements. Adaptations based on direct user feedback can lead to a more intuitive and user-friendly experience.