Gentic Data — Documentation

Give your AI agent a full cloud database. Import CSVs from any URL, query with SQL, insert and sync records, and manage tables — all through the Model Context Protocol. Powered by DuckDB via MotherDuck.

1. Getting Started

Sign Up & Get Your API Key

Before you can use Gentic Data, you need an API key to authenticate your requests.

  1. Go to gentic.co/data and create an account.
  2. Create an organization from your dashboard. API keys and billing are scoped to the organization.
  3. Generate an API key and use it as a Bearer token in your MCP client.

2. Connecting to the MCP Server

The server is available at https://mcp.gentic.co/data. For Claude Code:

```shell
claude mcp add gentic-data \
  --transport http \
  https://mcp.gentic.co/data \
  --header "Authorization: Bearer YOUR_API_KEY"
```

For Claude Web and ChatGPT you can also connect via OAuth — no API key needed. See the connect section on the landing page for other MCP clients (n8n, OpenClaw).

3. Agent Skill

For the best results, pair the MCP server with the Gentic Data agent skill. The MCP server gives your agent tool access; the skill teaches it the optimal workflow order. Both the raw SKILL.md and a ready-to-upload .skill bundle are generated on demand from the live manifest, so they always reflect the current tools and pricing.

Add the skill directly via URL:

https://gentic.co/data/SKILL.md

Or upload a .skill bundle to Claude Managed Agents:

https://gentic.co/data/gentic-data.skill

Download this file and upload it wherever Claude Managed Agents asks for a .skill file. It's a zip bundle generated on demand from the latest SKILL.md.

4. When to Apply

  • User wants to create a table or import data from a CSV, Google Sheet, or S3 URL.
  • User wants to query, analyze, or explore their data — tables, columns, sample rows, aggregates.
  • User wants to preview a table's structure, schema, or row count.
  • User wants to update, sync, upsert, or append data from a CSV.
  • User wants to insert one or more records into a table with duplicate prevention.
  • User wants to search text data semantically or find similar records.
  • User asks about their "database", "tables", or "data" in general.

5. Workflow

  1. Start with `list_database_tables`

    For any 'what data do I have' question, start with `list_database_tables`. It's free, returns all tables with row counts, and anchors the rest of the conversation. If the user names a table directly, skip to `sample_table` or `get_table_schema`.

  2. Import with `create_table_from_csv`

    Accepts HTTPS, S3, Google Sheets, and Google Drive URLs — the last two are auto-converted to direct CSV downloads. Files must be publicly accessible. Table names must be letters/numbers/underscores only. For semantic search, pass `embed_columns` to vectorize text columns — this is the only paid tool at 5¢/row, so mention the cost when importing large datasets. CSV files are capped at 100 MB.
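The 5¢/row charge is easy to surface programmatically before importing. A minimal pre-flight cost check (the constant mirrors the pricing stated above; the function name is illustrative, not part of the API):

```python
# Pre-flight cost check before a vectorized import.
# Per the docs: embed_columns is billed at 5 cents/row; plain imports are free.
EMBED_COST_PER_ROW = 0.05  # USD

def estimate_embed_cost(row_count: int, uses_embed_columns: bool) -> float:
    """Expected charge in USD for a create_table_from_csv call."""
    return row_count * EMBED_COST_PER_ROW if uses_embed_columns else 0.0

print(f"${estimate_embed_cost(10_000, True):,.2f}")  # $500.00
```

Surfacing this figure to the user before the call is exactly the "mention the cost" step above.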

  3. Always `sample_table` before writing SQL

    Before running `query_data` or inserting records, call `sample_table` or `get_table_schema` to see the real column names and types. Don't guess. `sample_table` gives you a feel for the data; `get_table_schema` gives you precise types. `query_data` is strictly read-only — only SELECT and WITH (CTEs), no INSERT/UPDATE/DELETE/DROP, no `read_csv_auto()` / `read_parquet()` / `glob()`.
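The read-only rules can also be mirrored client-side to fail fast. A rough illustrative guard, not the server's actual validator:

```python
# Illustrative guard mirroring the documented query_data rules: only SELECT
# and WITH (CTEs) are accepted, and file-reading functions are rejected.
# The server enforces this regardless; this only fails fast on the client.
BLOCKED_FUNCS = ("read_csv_auto", "read_parquet", "glob(")

def is_allowed_query(sql: str) -> bool:
    s = sql.lstrip().lower()
    if not s.startswith(("select", "with")):
        return False
    return not any(fn in s for fn in BLOCKED_FUNCS)

print(is_allowed_query("SELECT * FROM sales LIMIT 10"))                # True
print(is_allowed_query("SELECT * FROM read_parquet('data.parquet')"))  # False
```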

  4. Pick the right update tool

    `update_table_from_csv` is the general case with three modes: `replace` (drop + recreate), `append` (add all rows, allows duplicates), `upsert` (add only new rows keyed on `unique_column`). `sync_table_from_csv` is the best answer when the user says "sync", "update", or "refresh" — updates existing rows and inserts new ones in one call. `batch_update_table_from_csv` only touches existing rows (use for corrections). **Always ask which column contains unique identifiers** — don't default to append.
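This guidance reduces to a small decision table. A hypothetical routing helper (the tool names are the real MCP tools; the keyword matching is only a sketch):

```python
# Illustrative router for the update tools. Real agents should still ask the
# user which column holds unique identifiers before any non-replace update.
def pick_update_tool(user_request: str, csv_has_new_rows: bool) -> str:
    req = user_request.lower()
    if any(w in req for w in ("sync", "refresh", "update")):
        return "sync_table_from_csv"          # updates existing rows AND inserts new ones
    if "correct" in req or not csv_has_new_rows:
        return "batch_update_table_from_csv"  # touches existing rows only
    return "update_table_from_csv"            # general case: replace / append / upsert

print(pick_update_tool("please sync my leads table", True))  # sync_table_from_csv
```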

  5. Insert records safely

    `insert_record` for a single row, `batch_insert_records` (up to 1000 rows) for many. Both require `unique_column` for duplicate prevention — `insert_record` rejects duplicates, `batch_insert_records` silently skips them. Always `sample_table` first so you know the required columns.
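Because `batch_insert_records` caps each call at 1000 rows, larger datasets need client-side chunking. A minimal sketch:

```python
# Split a record list into batches of at most 1000 for batch_insert_records.
def chunk(records: list, size: int = 1000) -> list:
    return [records[i:i + size] for i in range(0, len(records), size)]

rows = [{"id": i, "name": f"user_{i}"} for i in range(2500)]
print([len(b) for b in chunk(rows)])  # [1000, 1000, 500]
```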

  6. Semantic search via `_vectors` tables

    `search_structured` runs cosine-similarity search over a `_vectors` table created with `embed_columns`. Key params: `table_name` must end in `_vectors`, `query` is natural language, `filters` is an optional SQL WHERE clause for structured columns (validated for safety), `limit` defaults to 20 (max 100). Use this when the user asks to find 'similar' items or to search by meaning rather than exact match.
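Conceptually, this is cosine-similarity ranking over per-row embeddings. A toy illustration of that ranking step with hand-made 2-D vectors (the real embeddings are computed server-side during the vectorized import):

```python
import math

# Toy cosine-similarity ranking, the mechanism search_structured uses
# to order rows by semantic relevance to the query.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

query_vec = [1.0, 0.0]  # pretend embedding for "shipping problems"
rows = {"late delivery": [0.9, 0.1], "great quality": [0.1, 0.9]}
best = max(rows, key=lambda k: cosine(query_vec, rows[k]))
print(best)  # late delivery
```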

  7. Present results clearly

    Don't dump raw JSON. For query results, render a markdown table with column headers; for large result sets, show a head/tail and summarize. For imports and updates, confirm the row counts that changed. For semantic search, lead with the top match and why it's relevant. Always end with a concrete next step (another query, a chained MCP call, an export).
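The rendering advice can be sketched as a small formatter: a markdown table showing the head and tail, plus an omission note for large result sets (a hypothetical helper, not part of the API):

```python
# Render query rows as a markdown table, truncating large result sets
# to a head and tail with an omission note.
def to_markdown(rows: list, max_rows: int = 10) -> str:
    if not rows:
        return "_no rows_"
    cols = list(rows[0])
    shown = rows if len(rows) <= max_rows else rows[:max_rows // 2] + rows[-(max_rows // 2):]
    lines = ["| " + " | ".join(cols) + " |",
             "| " + " | ".join("---" for _ in cols) + " |"]
    lines += ["| " + " | ".join(str(r[c]) for c in cols) + " |" for r in shown]
    if len(rows) > max_rows:
        lines.append(f"_{len(rows) - max_rows} of {len(rows)} rows omitted_")
    return "\n".join(lines)

print(to_markdown([{"region": "EU", "sales": 42}]))
```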

6. Tool Reference

11 tools, rendered live from the Gentic MCP manifest. Parameter tables come directly from each tool's JSON Schema.

batch_insert_records

Free

Insert multiple records into a table in one batch operation with duplicate prevention. All records must have the same columns. Maximum 1000 records per call. Use sample_table first to see the table structure.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the table to insert into |
| `records` (required) | object[] | Array of records to insert (all must have the same columns) |
| `unique_column` (required) | string | Column to check for duplicates (e.g. 'id', 'email'). Records with existing values are skipped. |

batch_update_table_from_csv

Free

Batch update ONLY existing records in a table from a CSV URL (ignores new records). If your CSV has both updates AND new records, use sync_table_from_csv instead.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the existing table to update |
| `csv_url` (required) | string | CSV URL with updated data (supports Google Sheets, Drive, S3, HTTPS) |
| `unique_column` (required) | string | Column to match records (e.g. 'id', 'email'). Only matching records are updated. |

create_table_from_csv

Free

Create a new table by importing a CSV file. Supports S3, HTTPS, Google Sheets, and Google Drive URLs. If the user wants to SEARCH, EXPLORE, or FIND PATTERNS in text-heavy data (reviews, feedback, support tickets, survey responses, comments), set embed_columns to the text columns that should be searchable. This creates a '_vectors' table with per-row embeddings that can be searched semantically using the search_structured tool. Processing is async and billed at 5¢/row. If the user just wants to IMPORT data for SQL analytics (counts, averages, aggregations), omit embed_columns for a plain synchronous import. Use query_data for SQL analysis afterward.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Base name for the table (e.g. 'reviews', 'support_tickets'). If embed_columns is provided, '_vectors' is appended automatically (e.g. 'reviews' → 'reviews_vectors'). |
| `csv_url` (required) | string | URL to a publicly accessible CSV file. Google Sheets and Drive URLs are auto-converted. |
| `embed_columns` (optional) | string[] | Text columns to embed for semantic search (e.g. ['review_title', 'review_text']). REQUIRED when the user wants to search/explore/find patterns in text data. Only include natural language columns — skip IDs, dates, and numbers. Omit entirely for a plain CSV import. |
| `column_types` (optional) | object | Explicit DuckDB type overrides for columns (e.g. {"submission_date": "TIMESTAMP"}). Only used with embed_columns. |

get_table_schema

Free

Get detailed schema information for a table including column names, data types, and nullability.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the table |

insert_record

Free

Insert a single record into a table with duplicate prevention. Use sample_table first to see the table structure and required columns.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the table to insert into |
| `record_data` (required) | object | Record to insert as key-value pairs (e.g. {"id": 123, "name": "John", "email": "john@example.com"}) |
| `unique_column` (required) | string | Column to check for duplicates (e.g. 'id', 'email'). Insert is rejected if value already exists. |

list_database_tables

Free

List all tables in your Gentic Data database with row counts. Use this to see what data you have available.

This tool takes no parameters.

query_data

Free

Execute a SQL SELECT query for data analysis — aggregations, counts, averages, GROUP BY, JOINs, filtering, and reporting. Works on all tables including '_vectors' tables. Supports SELECT and WITH (CTEs). Write operations and file-reading functions are blocked. Use sample_table first to understand the table structure. Do NOT use this for semantic/natural language search — use search_structured instead.

| Parameter | Type | Description |
| --- | --- | --- |
| `sql_query` (required) | string | SQL SELECT query to execute (e.g. "SELECT * FROM sales WHERE date > '2024-01-01' LIMIT 10") |

sample_table

Free

Get a preview of a table with sample records and column information. Use this to understand the table structure before running analysis queries.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the table to sample |
| `sample_size` (required) | integer (1–20, default 5) | Number of sample records to return |

search_structured

Free

Search through text data using natural language. Use this when the user wants to FIND, SEARCH, or EXPLORE specific topics, themes, or patterns in a '_vectors' table (created via create_table_from_csv with embed_columns). Returns full rows ranked by semantic relevance. Supports optional SQL filters to narrow results by date, rating, category, etc. Do NOT use this for SQL analytics (counts, averages, GROUP BY) — use query_data instead. Example: search_structured(table_name='reviews_vectors', query='shipping delays and damaged packaging', filters="rating <= 2 AND date > '2024-06-01'")

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | The vector table to search (e.g. 'reviews_vectors'). Must be a table created with embed_columns. |
| `query` (required) | string | Natural language search query (e.g. 'shipping delays and poor packaging') |
| `filters` (optional) | string | Optional SQL WHERE conditions for structured columns (e.g. "submission_date > '2024-06-01' AND rating <= 2") |
| `limit` (required) | integer (1–100, default 20) | Max results to return |
| `columns` (optional) | string[] | Which columns to return (defaults to all non-embedding columns) |

sync_table_from_csv

Free

Sync a table with CSV data: updates existing records AND adds new ones in a single operation. This is the recommended tool when a user asks to 'sync', 'update', or 'refresh' a table from a data source.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the existing table to sync |
| `csv_url` (required) | string | CSV URL with the data to sync (supports Google Sheets, Drive, S3, HTTPS) |
| `unique_column` (required) | string | Column to match records between CSV and table (e.g. 'id', 'email', 'video_url') |

update_table_from_csv

Free

Update an existing table from a CSV URL. Supports three modes: 'replace' (drop and recreate), 'append' (add all rows — may create duplicates), or 'upsert' (add only new rows based on unique_column). Use sample_table first to check columns.

| Parameter | Type | Description |
| --- | --- | --- |
| `table_name` (required) | string | Name of the existing table to update |
| `csv_url` (required) | string | CSV URL with the new data (supports Google Sheets, Drive, S3, HTTPS) |
| `mode` (required) | string (enum: replace, append, upsert; default "replace") | Update mode: 'replace' (overwrite all data), 'append' (add all rows), 'upsert' (add only new rows) |
| `unique_column` (optional) | string | Column to check for uniqueness (required when mode='upsert'). Example: 'id', 'email' |

7. Pricing

Pricing is pulled live from the Gentic MCP manifest. All prices are per call and deducted from your Gentic credits.

| Tool | Cost |
| --- | --- |
| `batch_insert_records` | Free |
| `batch_update_table_from_csv` | Free |
| `create_table_from_csv` | Free (5¢/row when `embed_columns` is set) |
| `get_table_schema` | Free |
| `insert_record` | Free |
| `list_database_tables` | Free |
| `query_data` | Free |
| `sample_table` | Free |
| `search_structured` | Free |
| `sync_table_from_csv` | Free |
| `update_table_from_csv` | Free |

8. Notes

  • All tools are organization-scoped — users only see their own database and tables.
  • All tools are free **except** vectorized imports via `create_table_from_csv` with `embed_columns` at 5¢/row. A 10k-row vectorized import is $500 — always surface the cost before running.
  • Table and column names can only contain letters, numbers, and underscores.
  • `query_data` is read-only. Use the insert/update tools for writes.
  • CSV files are capped at 100 MB per import.
  • Users don't need to 'create a database' — their database is provisioned automatically on first use.
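The naming rule is strict enough to check up front. A minimal validation sketch for table and column names (the helper itself is illustrative):

```python
import re

# Table and column names may only contain letters, numbers, and underscores.
VALID_NAME = re.compile(r"^[A-Za-z0-9_]+$")

def is_valid_table_name(name: str) -> bool:
    return bool(VALID_NAME.match(name))

print(is_valid_table_name("support_tickets"))  # True
print(is_valid_table_name("2024-sales"))       # False (hyphen)
```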