This document describes the public API endpoints for converting between DDI Codebook XML format and XLSForm JSON format.
The XLSForm JSON mirrors the actual XLSForm spreadsheet structure with three sheets:
- survey: questions and groups (columns: type, name, label, hint, required, appearance, parameters)
- choices: answer options for select questions (columns: list_name, name, label)
- settings: form metadata (columns: form_title, form_id, version)
Endpoint: POST /api/convert/ddi-to-xlsform
Access: Public (no authentication required)
Request:
- Content-Type:
application/xmlortext/xml - Body: DDI XML fragment — a
<var>,<varGrp>, or<dataDscr>wrapper (forselect_multiple/multipleRespgroups containing both<varGrp>and<var>elements)
Response:
- Content-Type:
application/json - Body: XLSForm JSON with survey, choices, and settings sheets
Example:
curl -X POST http://localhost:8090/api/convert/ddi-to-xlsform \
-H "Content-Type: application/xml" \
--data '<var ID="V1" name="gender" intrvl="discrete">
<concept>Gender</concept>
<qstn responseDomainType="category">
<qstnLit>What is your gender?</qstnLit>
</qstn>
<catgry>
<catValu>1</catValu>
<labl>Male</labl>
</catgry>
<catgry>
<catValu>2</catValu>
<labl>Female</labl>
</catgry>
<varFormat type="numeric" schema="other"/>
</var>'Response:
{
"survey": [
{
"type": "select_one gender",
"name": "gender",
"label": "What is your gender?"
}
],
"choices": [
{
"list_name": "gender",
"name": "1",
"label": "Male"
},
{
"list_name": "gender",
"name": "2",
"label": "Female"
}
],
"settings": {}
}Endpoint: POST /api/convert/xlsform-to-ddi
Access: Public (no authentication required)
Request:
- Content-Type:
application/json - Body: XLSForm JSON with survey, choices, and settings sheets
Response:
- Content-Type:
application/xml - Body: DDI XML — a single
<var>for simple questions, or a<dataDscr>wrapper containing<varGrp type="multipleResp">+ binary<var>elements forselect_multiplequestions
Example:
curl -X POST http://localhost:8090/api/convert/xlsform-to-ddi \
-H "Content-Type: application/json" \
--data '{
"survey": [
{"type": "select_one gender", "name": "gender", "label": "What is your gender?"}
],
"choices": [
{"list_name": "gender", "name": "1", "label": "Male"},
{"list_name": "gender", "name": "2", "label": "Female"}
],
"settings": {}
}'Response:
<?xml version="1.0" encoding="UTF-8"?>
<var ID="V_gender" name="gender" intrvl="discrete">
<concept>What is your gender?</concept>
<qstn responseDomainType="category">
<qstnLit>What is your gender?</qstnLit>
</qstn>
<catgry>
<catValu>1</catValu>
<labl>Male</labl>
</catgry>
<catgry>
<catValu>2</catValu>
<labl>Female</labl>
</catgry>
<varFormat type="numeric" schema="other"></varFormat>
</var>The JSON format mirrors the three sheets of an XLSForm spreadsheet:
Each row in the survey sheet is an object with these columns:
| Column | Required | Description |
|---|---|---|
type |
Yes | Answer type. For select questions, includes list_name: select_one <list_name> |
name |
Yes | Variable identifier (snake_case recommended) |
label |
No | Question text shown to respondents |
hint |
No | Additional hint text (maps to DDI preQTxt) |
required |
No | "yes" if the question is mandatory |
appearance |
No | Display preference |
parameters |
No | Key-value pairs, e.g. "guidance_hint=Show card" |
Groups use begin_group/end_group rows:
{
"survey": [
{"type": "begin_group", "name": "demographics", "label": "Demographics"},
{"type": "integer", "name": "age", "label": "What is your age?"},
{"type": "end_group", "name": ""}
]
}Section drop rule: qwacback's Schematron restricts varGrp/@type to grid, multipleResp, or other. Generic sections have no valid representation, so the XLSForm → DDI converter drops plain begin_group wrappers and flattens members to top-level <var> elements. A group is kept (as <varGrp type="grid">) only when it signals a grid layout: appearance: "table-list", "matrix" in the label, or "grid" in the name.
Grid emission: For grid groups, the begin_group label is emitted as <txt> on the <varGrp> and repeated as <preQTxt> on every member variable (matching the DDI convention that the lead-in question appears once at the group level and is echoed per row).
Each row in the choices sheet is an object with these columns:
| Column | Required | Description |
|---|---|---|
list_name |
Yes | References the list in the survey type column |
name |
Yes | Choice value (e.g. "1", "2") |
label |
Yes | Choice display text |
Single object with optional form metadata:
| Column | Required | Description |
|---|---|---|
form_title |
No | Form title |
form_id |
No | Form identifier |
version |
No | Form version |
A single question in DDI is represented by a <var> element. Here's the structure:
<var ID="V1" name="variable_name" intrvl="discrete">
<concept>Variable concept/label</concept>
<qstn responseDomainType="category">
<preQTxt>Optional introductory text</preQTxt>
<qstnLit>The actual question text</qstnLit>
<ivuInstr>Optional interviewer instructions</ivuInstr>
</qstn>
<catgry>
<catValu>1</catValu>
<labl>Option 1</labl>
</catgry>
<catgry>
<catValu>2</catValu>
<labl>Option 2</labl>
</catgry>
<varFormat type="numeric" schema="other"/>
</var>When questions are part of a group (like a matrix or multiple response set), use a <varGrp> element:
<varGrp ID="VG1" name="satisfaction_group" type="grid" var="V1 V2 V3">
<concept>Group concept/label</concept>
<txt>Optional introductory text for the group</txt>
</varGrp>The var attribute contains space-separated IDs of the variables that belong to this group.
To create a complete DDI codebook with single questions, wrap them in the full structure:
<?xml version="1.0" encoding="UTF-8"?>
<codeBook xmlns="ddi:codebook:2_5" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<stdyDscr>
<citation>
<titlStmt>
<titl>Study Title</titl>
<IDNo>study-id-123</IDNo>
</titlStmt>
</citation>
<stdyInfo>
<abstract>Study abstract</abstract>
<sumDscr>
<timePrd>2024</timePrd>
<nation>Country</nation>
<anlyUnit>Analysis unit</anlyUnit>
<universe>Study universe</universe>
<dataKind>Survey Data</dataKind>
</sumDscr>
</stdyInfo>
</stdyDscr>
<dataDscr>
<!-- Variable groups go here (if any) -->
<varGrp ID="VG1" name="group1" type="grid" var="V1 V2">
<concept>Group concept</concept>
<txt>Group description</txt>
</varGrp>
<!-- Individual variables go here -->
<var ID="V1" name="question1" intrvl="discrete">
<concept>Question 1</concept>
<qstn responseDomainType="category">
<qstnLit>What is your answer?</qstnLit>
</qstn>
<catgry>
<catValu>1</catValu>
<labl>Yes</labl>
</catgry>
<catgry>
<catValu>2</catValu>
<labl>No</labl>
</catgry>
<varFormat type="numeric" schema="other"/>
</var>
<var ID="V2" name="question2" intrvl="discrete">
<!-- ... -->
</var>
</dataDscr>
</codeBook>| DDI responseDomainType | XLSForm type | Notes |
|---|---|---|
numeric |
integer |
Numeric input |
text |
text |
Text input |
category |
select_one <name> |
Single choice (list_name = variable name) |
category + vocab |
select_one_from_file <vocab>.csv |
Long list single choice (external vocab) |
multiple |
select_multiple <name> |
Multiple choice (list_name = variable name) |
multiple + vocab |
select_multiple_from_file <vocab>.csv |
Long list multiple choice (external vocab) |
| XLSForm type | DDI output | DDI responseDomainType | intrvl | varFormat.type |
|---|---|---|---|---|
integer, decimal, range |
<var> |
numeric |
contin |
numeric |
text, note |
<var> |
text |
discrete |
character |
select_one, matrix |
<var> + <catgry> |
category |
discrete |
numeric |
select_multiple |
<varGrp type="multipleResp"> + binary <var> per choice |
multiple |
discrete |
numeric |
select_one_from_file |
<var> with concept/@vocab (no catgry) |
category |
discrete |
numeric |
select_multiple_from_file |
<var> with concept/@vocab (no catgry) |
multiple |
discrete |
numeric |
Note on select_multiple: Per DDI Codebook conventions, checkboxes are represented as a <varGrp type="multipleResp"> with one binary <var> per choice option. Each binary variable has categories with catValu 0 and 1 (no labl — the variable name and qstnLit already describe the checkbox option). The output is wrapped in a <dataDscr> element.
Note on select_*_from_file: For long lists referencing external vocabularies (e.g. select_one_from_file iso_3166_1.csv), the DDI output uses concept/@vocab instead of inline <catgry> elements. The vocab is the standard code (e.g. iso_3166_1), and the XLSForm CSV filename is derived by appending .csv.
Both endpoints return appropriate HTTP status codes:
- 200 OK: Successful conversion
- 400 Bad Request: Invalid input format or conversion error
- 413 Payload Too Large: Request body exceeds 50MB limit
Error responses include a descriptive message:
{
"code": 400,
"message": "Failed to convert DDI to XLSForm",
"data": {
"error": "input XML is neither a <var> nor a <varGrp> element"
}
}- The conversion preserves the core question structure but may not retain all DDI metadata
- Generated DDI IDs follow the pattern
V_<name>for variables andVG_<name>for groups - XLSForm
hintfield maps to DDIpreQTxt(pre-question text) - Missing value categories (DDI
missing="Y") are excluded from XLSForm choices - These endpoints are stateless and do not persist data to the database