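# data-agent

**Repository Path**: new_cloud1/data-agent

## Basic Information

- **Project Name**: data-agent
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-12-29
- **Last Updated**: 2026-01-05

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Data Agent

A general-purpose AI data agent that writes and executes Python code to process your data.

## Core Features

### Data Processing

- **Text classification / tagging**: use an LLM to batch-classify text such as reviews and documents
- **Information extraction**: extract structured information from unstructured text
- **Data merging**: table join operations similar to Excel VLOOKUP
- **Statistical analysis**: GROUP BY, aggregation, and other SQL/Pandas operations
- **Data visualization**: generate charts with matplotlib

### Intelligence

- The agent generates Python code based on your request
- Executes the code and corrects errors automatically
- Supports complex multi-step data processing workflows

### Cost Control

- Estimates LLM call cost before execution
- Supports multiple models (gpt-4o / gpt-4o-mini)
- Test on a sample first, confirm the result, then process the full dataset

### Safe Execution

- Blocks dangerous operations such as file operations and system calls
- Only data operations and data processing are allowed
- Exceptions raised during execution are caught

## Quick Start

### Install dependencies

```bash
# Using uv (recommended)
uv pip install -e .

# Or using pip
pip install -e .
```

### Configuration

Create a `.env` file:

```bash
OPENAI_API_KEY=your_openai_api_key_here

# Langfuse (used for monitoring)
LANGFUSE_PUBLIC_KEY=your_key
LANGFUSE_SECRET_KEY=your_secret
LANGFUSE_HOST=https://cloud.langfuse.com
```

### Run the app

```bash
streamlit run main.py
```

Open `http://localhost:8501` to start using it.

## Usage Examples

### Example 1: Review classification

```
User: "Classify the reviews in the reviews table (usage scenario, usage feedback, effectiveness review)"

Agent:
# Load the review data
reviews = ctx.query("SELECT * FROM reviews")

# Define the classification prompt
prompt = "Classify the following review as one of: usage scenario / usage feedback / effectiveness review. Review: {text}"

# Batch-call the LLM to classify (first 10 rows as a test)
# .copy() avoids pandas' SettingWithCopyWarning when adding a column to a slice
reviews_sample = reviews.head(10).copy()
reviews_sample['category'] = ctx.call_llm_batch(
    reviews_sample['comment'].tolist(),
    prompt,
    model='gpt-4o-mini'
)

# Save the result
ctx.save_table(reviews_sample, 'reviews_classified_sample')
print(reviews_sample[['comment', 'category']])
```

### Example 2: Table merge (VLOOKUP)

```
User: "Join the orders table and the products table on product_id"

Agent:
# Load the two tables
orders = ctx.query("SELECT * FROM orders")
products = ctx.query("SELECT * FROM products")

# Perform the join (similar to VLOOKUP)
result = orders.merge(
    products[['product_id', 'product_name', 'category']],
    on='product_id',
    how='left'
)

# Save the result
ctx.save_table(result, 'orders_enriched')
print(f"Joined {len(result)} order rows")
```

### Example 3: Statistical analysis

```
User: "Summarize sales amount and order count by region"

Agent:
# Load the sales data
sales = ctx.query("SELECT * FROM sales_data")

# Group and aggregate
summary = sales.groupby('region').agg({
    'amount': 'sum',
    'order_id': 'count'
}).reset_index()
summary.columns = ['region', 'total_amount', 'order_count']

# Save and display
ctx.save_table(summary, 'sales_by_region')
print(summary)
```

### Example 4: Data visualization

```
User: "Draw a bar chart of sales by region"

Agent:
# Load the summary data
summary = ctx.query("SELECT * FROM sales_by_region")

# Draw the chart
plt = ctx.plot()
plt.figure(figsize=(10, 6))
plt.bar(summary['region'], summary['total_amount'])
plt.xlabel('Region')
plt.ylabel('Sales amount')
plt.title('Sales by region')
plt.xticks(rotation=45)
```

## Architecture

```
User request
    ↓
LangGraph Agent (understands the request)
    ↓
Generates Python code
    ↓
User confirmation
    ↓
CodeExecutor (safe execution)
    ↓
DataContext (provides data and LLM access)
    ↓
Displays results, charts, and data
```

### Core Components

1. **DataContext**: data context
   - `ctx.query(sql)`: run a SQL query
   - `ctx.save_table(df, name)`: save a DataFrame to the database
   - `ctx.call_llm_batch(texts, prompt)`: call the LLM in batches
   - `ctx.show_progress(current, total, msg)`: show progress
   - `ctx.plot()`: get the matplotlib object

2. **CodeExecutor**: code executor
   - Security checks that block dangerous operations
   - Exception capture with friendly error messages
   - Records logs and charts

3. **DataProcessAgent**: LangGraph Agent
   - Understands user requests
   - Generates Python code
   - Manages the execution flow

## Security Mechanisms

### Prohibited operations

- `import os`, `import sys`
- `eval()`, `exec()`
- `open()` file operations
- `subprocess`, `shutil`, etc.

### Allowed modules

- `pandas`, `numpy`, `matplotlib`
- `datetime`, `json`, `re`, `math`

### Execution limits

- [illegible] 5 [illegible]
- Access is limited to data and the LLM API
- All errors are caught and logged

## Cost Optimization

### Model selection

| Model | Use case | Cost |
|------|---------|------|
| gpt-4o | Complex scenarios | High |
| gpt-4o-mini | Simple classification | Low |

### Best practices

1. **Test on a sample first**
   ```python
   # Try the first 10 rows and check the result
   sample = df.head(10).copy()
   sample['result'] = ctx.call_llm_batch(...)
   ```

2. **Choose the right model**
   ```python
   # Use mini for simple tasks
   ctx.call_llm_batch(texts, prompt, model='gpt-4o-mini')
   ```

3. **Process in batches**
   ```python
   # Adjust the batch size to reduce the number of requests
   ctx.call_llm_batch(texts, prompt, batch_size=100)
   ```

## Sample Data

Two sample tables are included:

**reviews**: review data
- id, comment, rating

**sales_data**: sales data
- order_id, product, amount, region, date

You can also work with your own data tables.

## Development

### Run tests

```bash
python test_agent.py
```

### Project structure

```
data-agent/
├── main.py            # Main entry point
├── test_agent.py      # Test script
├── pyproject.toml     # Dependency configuration
├── .env               # Environment variables
├── data_analysis.db   # DuckDB database
├── schema_cache.json  # Database schema cache
└── README.md          # This document
```

## Contributing

Issues and pull requests are welcome.

## License

MIT License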
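
## Appendix: Safety check sketch

The security section lists prohibited imports and calls (`os`, `sys`, `subprocess`, `shutil`, `eval()`, `exec()`, `open()`). How `CodeExecutor` enforces that list is not shown in this README, so the following is only a minimal sketch of one possible approach: a static scan of the generated code with Python's standard `ast` module before it runs. The names `BLOCKED_MODULES`, `BLOCKED_CALLS`, and `find_violations` are hypothetical and are not part of the project.

```python
import ast

# Hypothetical deny-lists, copied from the README's "Prohibited operations" list.
BLOCKED_MODULES = {"os", "sys", "subprocess", "shutil"}
BLOCKED_CALLS = {"eval", "exec", "open"}


def find_violations(source: str) -> list[str]:
    """Return descriptions of prohibited imports and calls found in `source`."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import os` or `import os.path as p`
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root in BLOCKED_MODULES:
                    violations.append(f"import of '{root}'")
        elif isinstance(node, ast.ImportFrom):
            # `from subprocess import run`
            root = (node.module or "").split(".")[0]
            if root in BLOCKED_MODULES:
                violations.append(f"import of '{root}'")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Direct calls such as `eval(...)` or `open(...)`
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to '{node.func.id}()'")
    return violations


if __name__ == "__main__":
    print(find_violations("import pandas as pd\ndf = pd.DataFrame()"))
    print(find_violations("import os\nopen('secrets.txt')"))
```

A static scan like this only catches direct references. Indirect access (for example through `__import__` or `getattr`) would need extra handling, so a check of this kind is best treated as one layer rather than a complete sandbox.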
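
## Appendix: Batching sketch

The best-practices list suggests adjusting `batch_size` to reduce the number of LLM requests. The README does not describe how `ctx.call_llm_batch` groups texts internally, so the sketch below only illustrates the arithmetic under the assumption that each batch becomes one request. The helpers `split_into_batches` and `request_count` are hypothetical.

```python
from math import ceil


def split_into_batches(texts: list[str], batch_size: int) -> list[list[str]]:
    """Split `texts` into consecutive chunks of at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]


def request_count(n_texts: int, batch_size: int) -> int:
    """Number of requests needed, assuming one request per batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return ceil(n_texts / batch_size)


if __name__ == "__main__":
    texts = [f"review {i}" for i in range(250)]
    batches = split_into_batches(texts, 100)
    # The last batch holds the remainder, so batch lengths need not be equal.
    print(len(batches), [len(b) for b in batches])
    print(request_count(len(texts), 1), request_count(len(texts), 100))
```

Under that assumption, larger batches mean fewer round trips, at the price of larger individual requests. The right value depends on the model's context limits, which this README does not specify.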