Data Protection Impact Assessment (DPIA)
Input from Graidable for the institution's own DPIA (GDPR Art. 35). The institution (controller) is responsible for the final, approved assessment. Not legal advice.
1. Summary
Graidable processes student submissions to generate draft assessments and feedback. The processing involves AI-based evaluation of individuals and potentially large volumes of data, so a DPIA is required or, at minimum, recommended. With the measures below (EU residency, pseudonymisation, Zero Data Retention (ZDR), human oversight, no model training), residual risk is assessed as low to moderate and acceptable for a limited pilot.
2. Description and data flow
Upload → storage (S3) → OCR (Textract / LandingAI) → LLM assessment (selected provider) → draft back to teacher → human approval. The final grade is set by the human; AI output is a draft only.
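The human-approval step at the end of this flow can be illustrated with a minimal sketch. It is not Graidable's actual implementation; the `Assessment` class, its fields and `release_grade` are hypothetical names chosen for illustration. The sketch shows one way to make sure an AI draft cannot be released as a grade until a teacher has approved or overridden it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assessment:
    """Hypothetical record: an AI draft that becomes a grade only after teacher approval."""
    submission_id: str
    draft_grade: str                      # AI suggestion, never final on its own
    final_grade: Optional[str] = None     # set only by a human
    approved_by: Optional[str] = None     # teacher who made the decision

    def approve(self, teacher_id: str, grade: Optional[str] = None) -> None:
        """Record the teacher's decision; the teacher may keep or override the draft."""
        if not teacher_id:
            raise ValueError("a human approver is required")
        self.final_grade = grade if grade is not None else self.draft_grade
        self.approved_by = teacher_id

    @property
    def is_final(self) -> bool:
        return self.final_grade is not None and self.approved_by is not None


def release_grade(assessment: Assessment) -> str:
    """Refuse to release anything that has not passed human approval."""
    if not assessment.is_final:
        raise PermissionError("draft has not been approved by a teacher")
    return assessment.final_grade
```

Keeping `draft_grade` and `final_grade` as separate fields means the audit trail can show both what the AI suggested and what the teacher decided.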
3. Purpose and legal basis
Purpose: more efficient, consistent assessment and feedback. The legal basis is determined by the institution (typically GDPR Art. 6(1)(e) public task; alternatively (f) legitimate interests or (a) consent). Graidable processes the data on the institution's behalf under a DPA (Art. 28). Free-text submissions may inadvertently contain special-category data (Art. 9); see R7 in the risk table.
4. Data categories and subjects
Identifiers (name, student ID, email), submission content, assessment data (scores, draft grades, feedback) about students; account/role/log data about teachers and teaching assistants. When used in lower/upper secondary school, children's data is processed, triggering heightened requirements.
5. Sub-processors and transfers
| Sub-processor | Function | Location (configured) | Mechanism |
|---|---|---|---|
| Railway | App hosting, database (Postgres), queue (Redis/RabbitMQ) | EU – EU West (Amsterdam) | DPA + SCC (Railway is a US company; data stored in the EU) |
| AWS S3 | Storage | EU – eu-west-1 (Ireland) | DPA, EU region, encryption |
| AWS Textract | OCR | EU – eu-west-1 (Ireland) | DPA, EU region |
| LandingAI (ADE) | OCR / visual analysis | EU (eu-west-1) or US – on demand | Zero Data Retention available in both regions; local/on-prem hosting for full data ownership (enterprise); SOC 2 Type II |
| LLM via OpenRouter | AI assessment | Configurable: EU / self-hosted / Norway-hosted | Zero retention, no training; SCC if US |
| AWS SES | Email | EU – eu-west-1 (Ireland) | DPA; no submission content |
| Stripe | Billing | EU (Ireland) | No student data |
The LLM step offers the widest choice of hosting: the model can be EU-hosted, self-hosted or Norway-hosted for maximum data residency. With LandingAI set to its EU region, S3, Textract and LandingAI all run in eu-west-1, giving an end-to-end EU-resident chain.
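A deployment could verify the configured locations before the pilot starts. The following is a minimal sketch under stated assumptions: the configuration keys, the location labels and the `check_residency` helper are hypothetical and do not reflect Graidable's real configuration format. Stripe is left out because it receives no student data.

```python
from typing import Dict, List

# Assumption: locations the institution accepts for student data.
ALLOWED_LOCATIONS = {"eu", "norway", "self-hosted"}


def check_residency(config: Dict[str, str]) -> List[str]:
    """Return the sub-processors whose configured location is not on the allowed list."""
    return sorted(
        name for name, location in config.items()
        if location.lower() not in ALLOWED_LOCATIONS
    )


# Hypothetical pilot configuration mirroring the table above.
pilot_config = {
    "railway": "eu",
    "s3": "eu",
    "textract": "eu",
    "landingai": "eu",   # must be pinned to the EU region; US is also available
    "llm": "norway",
    "ses": "eu",
}

violations = check_residency(pilot_config)
```

An empty `violations` list means every step that handles student data is configured for an accepted location; any name in the list points to a setting to correct before go-live.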
6. Necessity and proportionality
Data minimisation: submissions should be de-identified before being sent to the LLM, so that only an internal ID accompanies the text. The architecture separates identity from submission, making this feasible. Retention: data is kept while the account/class is active and is deleted when the corresponding submission, assignment, class or account is deleted (account deletion also removes the S3 objects); pilot data is deleted after the pilot ends.
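The de-identification step might look like the sketch below. It rests on assumptions: the internal ID is derived here with a keyed hash (HMAC-SHA256), which is one possible technique and not necessarily the one Graidable uses, and all function and field names are hypothetical. Plain string replacement only removes identifiers that are already known, so it does not catch other personal details a student may write in free text.

```python
import hashlib
import hmac
import re
from typing import Dict, List


def internal_id(student_id: str, secret: bytes) -> str:
    """Derive a stable pseudonym with a keyed hash; the secret stays outside the LLM path."""
    return hmac.new(secret, student_id.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def deidentify(text: str, identifiers: List[str]) -> str:
    """Replace known identifiers (name, student ID, email) with a placeholder."""
    # Longest first, so a shorter identifier cannot split a longer one.
    for ident in sorted(identifiers, key=len, reverse=True):
        if ident:
            text = re.sub(re.escape(ident), "[REDACTED]", text, flags=re.IGNORECASE)
    return text


def build_llm_payload(student: Dict[str, str], submission_text: str, secret: bytes) -> Dict[str, str]:
    """Build the only data that is sent to the LLM: a pseudonym and the scrubbed text."""
    identifiers = [student["name"], student["student_id"], student["email"]]
    return {
        "internal_id": internal_id(student["student_id"], secret),
        "text": deidentify(submission_text, identifiers),
    }
```

Because the pseudonym depends on a secret key held only on the application side, the LLM provider cannot map it back to a student ID by hashing candidate IDs.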
7. Risk assessment
| # | Risk to the data subject | Likelihood | Impact | Measures |
|---|---|---|---|---|
| R1 | Unauthorised access | Low | High | Encryption, access control, EU hosting, logging |
| R2 | Transfer to a third country (US) | Medium | Medium | OCR/storage in EU; LLM set to EU/local; SCC |
| R3 | Use of data for model training | Low | High | Contractual no-training commitment; Zero Data Retention (OCR); zero retention (LLM) |
| R4 | Re-identification | Low | Medium | Pseudonymisation; identity/submission separation |
| R5 | Inaccurate/biased assessment | Medium | High | Mandatory human review; no automated grade |
| R6 | Fully automated decision (Art. 22) | Low | High | Output is a draft; teacher decides |
| R7 | Sensitive data in free text (Art. 9) | Medium | Medium | Data minimisation; access restriction |
| R8 | Insufficient transparency | Medium | Medium | Institution informs students |
8. EU AI Act
AI systems used to evaluate learning outcomes are classified as high-risk (Annex III). Human oversight is ensured (the output is a draft and a human decides); the institution informs students; this DPIA, the DPA and the sub-processor list provide documentation; the system performs no emotion recognition or biometric categorisation. The institution is the deployer.
9. Conclusion
With these measures, residual risk is low to moderate and proportionate for a limited pilot, provided that EU residency, pseudonymisation and human oversight are in place. Final approval rests with the institution's DPO.