Training data that tells the truth.

Corpora in Québec French, created and verified by certified experts. To train models faithful to our language, our law and our reality.

Three ways to work together.

01

From catalog

Ready-to-use datasets, by domain, created by our Québec experts: law, health, finance and technical fields.

02

Custom

We create data to your exact specifications. 12-month exclusivity included.

03

Exclusive

Data created only for you, for a lasting advantage. Yours alone.

Pricing on request. Let's talk about your needs.

The domains we cover

Nine broad families, more than 20 domains, documented by professionals who practice them every day.
Law & justice
Civil · case law · taxation
Health & life sciences
Medicine · pharmacy · nursing · nutrition
Sciences & mathematics
Math · physics · chemistry · biology
Engineering & technology
Civil · mechanical · software · computer science
Finance & management
Accounting (CPA) · management · real estate
Education & social sciences
Pedagogy · social work · psychology
Language & linguistics
Registers · varieties of fr-CA
Culture & creation
Film · design · heritage
Environment & agriculture
Standards · agri-food

Verified, traced, defensible.

Rigor before volume. Every piece of knowledge passes through three hands before entering the base.

01

Produced by an expert

02

Reviewed by a peer

03

Validated by our AI team

Provenance traced end to end. 100% created by humans.

Legal, ethical, no gray areas.

While others face lawsuits over scraped data, ours is beyond reproach.

Clear ownership

Created from scratch. Veridak owns 100% of the rights and transfers them to you by contract.

Zero scraping

Nothing is pulled from the web. Built specifically for training, and therefore defensible.

Compliance

Law 25, GDPR and the EU AI Act. Full traceability of the creation chain.

Ethical by design

Experts paid fairly, informed consent, no real personal data.

Delivered in your format.

JSONL · CSV · Format on request · Documentation included
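
To give a sense of what a JSONL delivery can look like on your side, here is a minimal sketch in Python. The record layout is purely illustrative: every field name (`id`, `domain`, `language`, `prompt`, `response`, `provenance` and its three roles) is an assumption made for this example, not Veridak's actual schema, which is defined with you during the analysis step.

```python
import json

# Hypothetical record layout: every field name below is an assumption made
# for illustration, not an actual delivery schema.
SAMPLE_JSONL = "\n".join([
    json.dumps({
        "id": "example-0001",
        "domain": "law",
        "language": "fr-CA",
        "prompt": "(question text)",
        "response": "(expert answer with detailed reasoning)",
        "provenance": {"author": "expert-a", "reviewer": "peer-b", "validator": "ai-team"},
    }, ensure_ascii=False),
    json.dumps({
        "id": "example-0002",
        "domain": "health",
        "language": "fr-CA",
        "prompt": "(question text)",
        "response": "(expert answer with detailed reasoning)",
        "provenance": {"author": "expert-c", "reviewer": "peer-d", "validator": "ai-team"},
    }, ensure_ascii=False),
])


def load_records(jsonl_text):
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def fully_traced(record):
    """Return True when all three (assumed) provenance roles are filled in."""
    provenance = record.get("provenance", {})
    return all(provenance.get(role) for role in ("author", "reviewer", "validator"))


records = load_records(SAMPLE_JSONL)
traced = [r for r in records if fully_traced(r)]
print(f"{len(traced)} of {len(records)} records carry a complete provenance chain")
```

A check like `fully_traced` is one way to confirm on receipt that each record names who produced, reviewed and validated it, mirroring the three-step chain described above.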
01

Analysis

We understand your goals, your domain and your model gaps.

02

Creation

Our experts create rich data, with detailed reasoning.

03

Double QA

Review by a domain expert, then validation by our AI/NLP team.

04

Delivery

In your preferred format, with full documentation and support.

Let's talk about your data.

Describe your needs or request a sample: we reply within 24 to 48 hours.