Gentic Data — Documentation

Give your AI agent a full cloud database. Import CSVs from any URL, query with SQL, insert and sync records, and manage tables — all through the Model Context Protocol. Powered by DuckDB via MotherDuck.

1. Getting Started

Sign Up & Get Your API Key

Before you can use Gentic Data, you need an API key to authenticate your requests.

  1. Go to gentic.co/data and create an account.
  2. Create an organization from your dashboard. API keys and billing are scoped to the organization.
  3. Generate an API key and use it as a Bearer token in your MCP client.

2. Connecting to the MCP Server

The server is available at https://mcp.gentic.co/data. For Claude Code:

```shell
claude mcp add gentic-data \
  --transport http \
  https://mcp.gentic.co/data \
  --header "Authorization: Bearer YOUR_API_KEY"
```

For Claude Web and ChatGPT you can also connect via OAuth — no API key needed. See the connect section on the landing page for other MCP clients (n8n, OpenClaw).

3. Agent Skill

For the best results, pair the MCP server with the Gentic Data agent skill. The MCP server gives your agent tool access; the skill teaches it the optimal workflow order. Both the raw SKILL.md and a ready-to-upload .skill bundle are generated on demand from the live manifest, so they always reflect the current tools and pricing.

Add the skill directly via URL:

https://gentic.co/data/SKILL.md

Or upload a .skill bundle to Claude Managed Agents:

https://gentic.co/data/gentic-data.skill

Download this file and upload it wherever Claude Managed Agents asks for a .skill file. It's a zip bundle generated on demand from the latest SKILL.md.

4. When to Apply

  • User wants to create a table or import data from a CSV, Google Sheet, or S3 URL.
  • User wants to query, analyze, or explore their data — tables, columns, sample rows, aggregates.
  • User wants to preview a table's structure, schema, or row count.
  • User wants to update, sync, upsert, or append data from a CSV.
  • User wants to insert one or more records into a table with duplicate prevention.
  • User wants to search text data semantically or find similar records.
  • User asks about their "database", "tables", or "data" in general.

5. Workflow

  1. Start with `list_database_tables`

    For any 'what data do I have' question, start with `list_database_tables`. It's free, returns all tables with row counts, and anchors the rest of the conversation. If the user names a table directly, skip to `sample_table` or `get_table_schema`.

  2. Import with `create_table_from_csv`

    Accepts HTTPS, S3, Google Sheets, and Google Drive URLs — the last two are auto-converted to direct CSV downloads. Files must be publicly accessible. Table names must be letters/numbers/underscores only. For semantic search, pass `embed_columns` to vectorize text columns — this is the only paid tool at 5¢/row, so mention the cost when importing large datasets. CSV files are capped at 100 MB.
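The 5¢/row charge is easy to surface programmatically before importing. A minimal pre-flight cost check (the constant mirrors the pricing stated above; the function name is illustrative, not part of the API):

```python
# Pre-flight cost check before a vectorized import.
# Per the docs: embed_columns is billed at 5 cents/row; plain imports are free.
EMBED_COST_PER_ROW = 0.05  # USD

def estimate_embed_cost(row_count: int, uses_embed_columns: bool) -> float:
    """Expected charge in USD for a create_table_from_csv call."""
    return row_count * EMBED_COST_PER_ROW if uses_embed_columns else 0.0

print(f"${estimate_embed_cost(10_000, True):,.2f}")  # $500.00
```

Surfacing this figure to the user before the call is exactly the "mention the cost" step above.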

  3. Always `sample_table` before writing SQL

    Before running `query_data` or inserting records, call `sample_table` or `get_table_schema` to see the real column names and types. Don't guess. `sample_table` gives you a feel for the data; `get_table_schema` gives you precise types. `query_data` is strictly read-only — only SELECT and WITH (CTEs), no INSERT/UPDATE/DELETE/DROP, no `read_csv_auto()` / `read_parquet()` / `glob()`.
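The read-only rules can also be mirrored client-side to fail fast. A rough illustrative guard, not the server's actual validator:

```python
# Illustrative guard mirroring the documented query_data rules: only SELECT
# and WITH (CTEs) are accepted, and file-reading functions are rejected.
# The server enforces this regardless; this only fails fast on the client.
BLOCKED_FUNCS = ("read_csv_auto", "read_parquet", "glob(")

def is_allowed_query(sql: str) -> bool:
    s = sql.lstrip().lower()
    if not s.startswith(("select", "with")):
        return False
    return not any(fn in s for fn in BLOCKED_FUNCS)

print(is_allowed_query("SELECT * FROM sales LIMIT 10"))                # True
print(is_allowed_query("SELECT * FROM read_parquet('data.parquet')"))  # False
```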

  4. Pick the right update tool

    `update_table_from_csv` is the general case with three modes: `replace` (drop + recreate), `append` (add all rows, allows duplicates), `upsert` (add only new rows keyed on `unique_column`). `sync_table_from_csv` is the best answer when the user says "sync", "update", or "refresh" — updates existing rows and inserts new ones in one call. `batch_update_table_from_csv` only touches existing rows (use for corrections). **Always ask which column contains unique identifiers** — don't default to append.
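This guidance reduces to a small decision table. A hypothetical routing helper (the tool names are the real MCP tools; the keyword matching is only a sketch):

```python
# Illustrative router for the update tools. Real agents should still ask the
# user which column holds unique identifiers before any non-replace update.
def pick_update_tool(user_request: str, csv_has_new_rows: bool) -> str:
    req = user_request.lower()
    if any(w in req for w in ("sync", "refresh", "update")):
        return "sync_table_from_csv"          # updates existing rows AND inserts new ones
    if "correct" in req or not csv_has_new_rows:
        return "batch_update_table_from_csv"  # touches existing rows only
    return "update_table_from_csv"            # general case: replace / append / upsert

print(pick_update_tool("please sync my leads table", True))  # sync_table_from_csv
```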

  5. Insert records safely

    `insert_record` for a single row, `batch_insert_records` (up to 1000 rows) for many. Both require `unique_column` for duplicate prevention — `insert_record` rejects duplicates, `batch_insert_records` silently skips them. Always `sample_table` first so you know the required columns.
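Because `batch_insert_records` caps each call at 1000 rows, larger datasets need client-side chunking. A minimal sketch:

```python
# Split a record list into batches of at most 1000 for batch_insert_records.
def chunk(records: list, size: int = 1000) -> list:
    return [records[i:i + size] for i in range(0, len(records), size)]

rows = [{"id": i, "name": f"user_{i}"} for i in range(2500)]
print([len(b) for b in chunk(rows)])  # [1000, 1000, 500]
```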

  6. Semantic search via `_vectors` tables

    `search_structured` runs cosine-similarity search over a `_vectors` table created with `embed_columns`. Key params: `table_name` must end in `_vectors`, `query` is natural language, `filters` is an optional SQL WHERE clause for structured columns (validated for safety), `limit` defaults to 20 (max 100). Use this when the user asks to find 'similar' items or to search by meaning rather than exact match.
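Conceptually, this is cosine-similarity ranking over per-row embeddings. A toy illustration of that ranking step with hand-made 2-D vectors (the real embeddings are computed server-side during the vectorized import):

```python
import math

# Toy cosine-similarity ranking, the mechanism search_structured uses
# to order rows by semantic relevance to the query.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

query_vec = [1.0, 0.0]  # pretend embedding for "shipping problems"
rows = {"late delivery": [0.9, 0.1], "great quality": [0.1, 0.9]}
best = max(rows, key=lambda k: cosine(query_vec, rows[k]))
print(best)  # late delivery
```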

  7. Present results clearly

    Don't dump raw JSON. For query results, render a markdown table with column headers; for large result sets, show a head/tail and summarize. For imports and updates, confirm the row counts that changed. For semantic search, lead with the top match and why it's relevant. Always end with a concrete next step (another query, a chained MCP call, an export).
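The rendering advice can be sketched as a small formatter: a markdown table showing the head and tail, plus an omission note for large result sets (a hypothetical helper, not part of the API):

```python
# Render query rows as a markdown table, truncating large result sets
# to a head and tail with an omission note.
def to_markdown(rows: list, max_rows: int = 10) -> str:
    if not rows:
        return "_no rows_"
    cols = list(rows[0])
    shown = rows if len(rows) <= max_rows else rows[:max_rows // 2] + rows[-(max_rows // 2):]
    lines = ["| " + " | ".join(cols) + " |",
             "| " + " | ".join("---" for _ in cols) + " |"]
    lines += ["| " + " | ".join(str(r[c]) for c in cols) + " |" for r in shown]
    if len(rows) > max_rows:
        lines.append(f"_{len(rows) - max_rows} of {len(rows)} rows omitted_")
    return "\n".join(lines)

print(to_markdown([{"region": "EU", "sales": 42}]))
```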

6. Tool Reference

11 tools, rendered live from the Gentic MCP manifest. Parameter tables come directly from each tool's JSON Schema.

batch_insert_records

Free

Insert multiple records into a table in one batch operation with duplicate prevention. All records must have the same columns. Maximum 1000 records per call. Use sample_table first to see the table structure.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the table to insert into |
| `records` (required) | object[] | Array of records to insert (all must have the same columns) |
| `unique_column` (required) | string | Column to check for duplicates (e.g. 'id', 'email'). Records with existing values are skipped. |

batch_update_table_from_csv

Free

Batch update ONLY existing records in a table from a CSV URL (ignores new records). If your CSV has both updates AND new records, use sync_table_from_csv instead.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the existing table to update |
| `csv_url` (required) | string | CSV URL with updated data (supports Google Sheets, Drive, S3, HTTPS) |
| `unique_column` (required) | string | Column to match records (e.g. 'id', 'email'). Only matching records are updated. |

create_table_from_csv

Free

Create a new table by importing a CSV file. Supports S3, HTTPS, Google Sheets, and Google Drive URLs. If the user wants to SEARCH, EXPLORE, or FIND PATTERNS in text-heavy data (reviews, feedback, support tickets, survey responses, comments), set embed_columns to the text columns that should be searchable. This creates a '_vectors' table with per-row embeddings that can be searched semantically using the search_structured tool. Processing is async and billed at 5¢/row. If the user just wants to IMPORT data for SQL analytics (counts, averages, aggregations), omit embed_columns for a plain synchronous import. Use query_data for SQL analysis afterward.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Base name for the table (e.g. 'reviews', 'support_tickets'). If embed_columns is provided, '_vectors' is appended automatically (e.g. 'reviews' → 'reviews_vectors'). |
| `csv_url` (required) | string | URL to a publicly accessible CSV file. Google Sheets and Drive URLs are auto-converted. |
| `embed_columns` (optional) | string[] | Text columns to embed for semantic search (e.g. ['review_title', 'review_text']). REQUIRED when the user wants to search/explore/find patterns in text data. Only include natural language columns — skip IDs, dates, and numbers. Omit entirely for a plain CSV import. |
| `column_types` (optional) | object | Explicit DuckDB type overrides for columns (e.g. {"submission_date": "TIMESTAMP"}). Only used with embed_columns. |

get_table_schema

Free

Get detailed schema information for a table including column names, data types, and nullability.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the table |

insert_record

Free

Insert a single record into a table with duplicate prevention. Use sample_table first to see the table structure and required columns.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the table to insert into |
| `record_data` (required) | object | Record to insert as key-value pairs (e.g. {"id": 123, "name": "John", "email": "john@example.com"}) |
| `unique_column` (required) | string | Column to check for duplicates (e.g. 'id', 'email'). Insert is rejected if value already exists. |

list_database_tables

Free

List all tables in your Gentic Data database with row counts. Use this to see what data you have available.

This tool takes no parameters.

query_data

Free

Execute a SQL SELECT query for data analysis — aggregations, counts, averages, GROUP BY, JOINs, filtering, and reporting. Works on all tables including '_vectors' tables. Supports SELECT and WITH (CTEs). Write operations and file-reading functions are blocked. Use sample_table first to understand the table structure. Do NOT use this for semantic/natural language search — use search_structured instead.

| Parameter | Type | Description |
| --- | --- | --- |
| `sql_query` (required) | string | SQL SELECT query to execute (e.g. "SELECT * FROM sales WHERE date > '2024-01-01' LIMIT 10") |

sample_table

Free

Get a preview of a table with sample records and column information. Use this to understand the table structure before running analysis queries.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the table to sample |
| `sample_size` (required) | integer (1–20, default 5) | Number of sample records to return |

search_structured

Free

Search through text data using natural language. Use this when the user wants to FIND, SEARCH, or EXPLORE specific topics, themes, or patterns in a '_vectors' table (created via create_table_from_csv with embed_columns). Returns full rows ranked by semantic relevance. Supports optional SQL filters to narrow results by date, rating, category, etc. Do NOT use this for SQL analytics (counts, averages, GROUP BY) — use query_data instead. Example: search_structured(table_name='reviews_vectors', query='shipping delays and damaged packaging', filters="rating <= 2 AND date > '2024-06-01'")

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | The vector table to search (e.g. 'reviews_vectors'). Must be a table created with embed_columns. |
| `query` (required) | string | Natural language search query (e.g. 'shipping delays and poor packaging') |
| `filters` (optional) | string | Optional SQL WHERE conditions for structured columns (e.g. "submission_date > '2024-06-01' AND rating <= 2") |
| `limit` (required) | integer (1–100, default 20) | Max results to return |
| `columns` (optional) | string[] | Which columns to return (defaults to all non-embedding columns) |

sync_table_from_csv

Free

Sync a table with CSV data: updates existing records AND adds new ones in a single operation. This is the recommended tool when a user asks to 'sync', 'update', or 'refresh' a table from a data source.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the existing table to sync |
| `csv_url` (required) | string | CSV URL with the data to sync (supports Google Sheets, Drive, S3, HTTPS) |
| `unique_column` (required) | string | Column to match records between CSV and table (e.g. 'id', 'email', 'video_url') |

update_table_from_csv

Free

Update an existing table from a CSV URL. Supports three modes: 'replace' (drop and recreate), 'append' (add all rows — may create duplicates), or 'upsert' (add only new rows based on unique_column). Use sample_table first to check columns.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the existing table to update |
| `csv_url` (required) | string | CSV URL with the new data (supports Google Sheets, Drive, S3, HTTPS) |
| `mode` (required) | string (enum: replace, append, upsert; default "replace") | Update mode: 'replace' (overwrite all data), 'append' (add all rows), 'upsert' (add only new rows) |
| `unique_column` (optional) | string | Column to check for uniqueness (required when mode='upsert'). Example: 'id', 'email' |

7. Pricing

Pricing is pulled live from the Gentic MCP manifest. All prices are per call and deducted from your Gentic credits.

| Tool | Cost |
| --- | --- |
| `batch_insert_records` | Free |
| `batch_update_table_from_csv` | Free |
| `create_table_from_csv` | Free (5¢/row when `embed_columns` is set) |
| `get_table_schema` | Free |
| `insert_record` | Free |
| `list_database_tables` | Free |
| `query_data` | Free |
| `sample_table` | Free |
| `search_structured` | Free |
| `sync_table_from_csv` | Free |
| `update_table_from_csv` | Free |

8. Notes

  • All tools are organization-scoped — users only see their own database and tables.
  • All tools are free **except** vectorized imports via `create_table_from_csv` with `embed_columns` at 5¢/row. A 10k-row vectorized import is $500 — always surface the cost before running.
  • Table and column names can only contain letters, numbers, and underscores.
  • `query_data` is read-only. Use the insert/update tools for writes.
  • CSV files are capped at 100 MB per import.
  • Users don't need to 'create a database' — their database is provisioned automatically on first use.
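The naming rule is strict enough to check up front. A minimal validation sketch for table and column names (the helper itself is illustrative):

```python
import re

# Table and column names may only contain letters, numbers, and underscores.
VALID_NAME = re.compile(r"^[A-Za-z0-9_]+$")

def is_valid_table_name(name: str) -> bool:
    return bool(VALID_NAME.match(name))

print(is_valid_table_name("support_tickets"))  # True
print(is_valid_table_name("2024-sales"))       # False (hyphen)
```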