# Dataset for Project: Font Type and Reading Comprehension – An Empirical Study

This file describes the research data associated with the manuscript titled "Font Type and Reading Comprehension – An Empirical Study".

**Date created:** December 12, 2025

---

**Location of data collection:** We chose schools in small towns whose students' results in national assessments (such as competence measurements) are close to the national average. We included all students in each class; we did not select based on any criteria. The sample comprises 651 students in grades 4–7 from 14 different institutions.

---

**Version:** 1.0

---

## Citation

If you use this dataset, please cite the original scientific publication and the dataset itself.

* **Dataset:** Gombos, P., Látics, B. (2025). Font Type and Reading Comprehension – An Empirical Study [Dataset]. *MATE Institutional Repository*.

---

## Description

The fundamental question of this research is what role font choice plays in text comprehension and recall among 4th–7th grade students. The study hypothesizes that a text written in a serif font has a more favorable effect on students’ text comprehension and information recall than one written in a sans-serif font. There is no clear consensus in the literature on this issue. Previous studies have primarily examined the on-screen readability of fonts, typically among adults. This research employed and compared two comprehension tests, one using a serif font (Cambria) and one using a sans-serif font (Candara). A total of 651 4th–7th grade students participated in the study. After reading each text, they completed a comprehension test. Two weeks later, 379 students took recall tests consisting of five questions each. The aim of these tests was to assess whether information retrieval was more effective for the serif or the sans-serif text. A paired-samples t-test was used to compare the comprehension and recall test results. No significant differences were found.
The hypotheses were not confirmed: for these students, text written in a serif font did not have a more favorable effect on either text comprehension or information recall.

**Keywords:** font type, serif, sans serif, reading comprehension, information recall, students

---

## File Description

* **File Name:** `Dataset_Font-type-and-reading-comprehension-An-empirical-study.xlsx`
* **Format:** Microsoft Excel (.xlsx)
* **Description:** This file contains all the raw data presented in the manuscript. The data are organised into separate sheets, one per experiment.

---

## Data Dictionary

The Excel file contains the following worksheets, columns and rows.

### Sheet: `Comprehension test in serif`

This sheet contains data for the comprehension test in serif.

#### Column A: `Student's serial number`

* **Number of elements in the sample:** 651 students
* **Grades:** grades 4–7
* **Institutions:** 14 different institutions, including public and foundation schools
* **Sampling:** simple, non-probability sampling

#### Column B: `Score achieved`

* **Task types:** Five true-false statements and five sentences to be completed based on what was read.
* **Maximum achievable score:** 10 points
* **Correction and evaluation guide:** Available. Each correct answer was worth 1 point.
* **Topics of texts:** The texts introduce the sport of cricket or baseball.
* **Time limit:** half a class period
* **Measurement:** on paper

---

#### Serif typeface materials

* **Materials:** The original typeset texts can be found in the accompanying PDF files.
* **File names:** `The_baseball_serif_font.pdf` and `The_cricket_serif_font.pdf`
* **Language of the texts:** All texts were presented **in Hungarian**.
* **Typeface of the texts:** The texts were set **in a serif typeface**.

---

### Sheet: `Comp. test in sans serif`

This sheet contains data for the comprehension test in sans serif.
#### Column A: `Student's serial number`

* **Number of elements in the sample:** 651 students
* **Grades:** grades 4–7
* **Institutions:** 14 different institutions, including public and foundation schools
* **Sampling:** simple, non-probability sampling

#### Column B: `Score achieved`

* **Task types:** Five true-false statements and five sentences to be completed based on what was read.
* **Maximum achievable score:** 10 points
* **Correction and evaluation guide:** Available. Each correct answer was worth 1 point.
* **Topics of texts:** The texts introduce the sport of cricket or baseball.
* **Time limit:** half a class period
* **Measurement:** on paper

---

#### Sans serif typeface materials

* **Materials:** The original typeset texts can be found in the accompanying PDF files.
* **File names:** `The_baseball_sans_serif_font.pdf` and `The_cricket_sans_serif_font.pdf`
* **Language of the texts:** All texts were presented **in Hungarian**.
* **Typeface of the texts:** The texts were set **in a sans serif typeface**.

---

### Sheet: `Information recall in serif`

This sheet contains data for information recall in serif.

#### Column A: `Student's serial number`

* **Number of elements in the sample:** 379 students
* **Grades:** grades 4–7
* **Institutions:** 14 different institutions, including public and foundation schools
* **Sampling:** simple, non-probability sampling
* **Comment:** The decrease in the sample size can be explained by the epidemiological situation at the time (several classes were quarantined).

#### Column B: `Score achieved`

* **Task types:** Four multiple-choice tasks and one sentence to be completed with the correct information.
* **Maximum achievable score:** 5 points
* **Correction and evaluation guide:** Available. Each correct answer was worth 1 point.
* **Time limit:** 10 minutes
* **Measurement:** on paper

---

#### Questionnaire materials

* **Materials:** The questionnaire, including the original questions, can be found in the accompanying PDF files.
* **Name of the questionnaire:** `Information_recall_test.pdf`
* **Language of the questionnaire:** All texts and questions in the questionnaire were presented **in Hungarian**.
* **Typeface of the texts:** The questionnaire measured information recall using texts set **in a serif typeface**.
* **Maximum achievable score:** 5 points
* **Measurement mode:** Paper-based administration.

---

### Sheet: `Inf. recall in sans serif`

This sheet contains data for information recall in sans serif typeface.

#### Column A: `Student's serial number`

* **Number of elements in the sample:** 379 students
* **Grades:** grades 4–7
* **Institutions:** 14 different institutions, including public and foundation schools
* **Sampling:** simple, non-probability sampling
* **Comment:** The decrease in the sample size can be explained by the epidemiological situation at the time (several classes were quarantined).

#### Column B: `Score achieved`

* **Task types:** Four multiple-choice tasks and one sentence to be completed with the correct information.
* **Maximum achievable score:** 5 points
* **Correction and evaluation guide:** Available. Each correct answer was worth 1 point.
* **Time limit:** 10 minutes
* **Measurement:** on paper

---

#### Questionnaire materials

* **Materials:** The questionnaire, including the original questions, can be found in the accompanying PDF files.
* **Name of the questionnaire:** `Information_recall_test.pdf`
* **Language of the questionnaire:** All texts and questions in the questionnaire were presented **in Hungarian**.
* **Typeface of the texts:** The questionnaire measured information recall using texts set **in a sans serif typeface**.
* **Maximum achievable score:** 5 points
* **Measurement mode:** Paper-based administration.
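
The worksheet layout described in this data dictionary can be summarised as a small specification and used to sanity-check scores after they have been read from the Excel file. The following is a minimal sketch, not part of the dataset: the names `SheetSpec`, `SHEETS` and `check_scores` are hypothetical, and it assumes that the sheet names match the data dictionary exactly and that each sheet holds one score per student.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SheetSpec:
    """Expected properties of one worksheet (taken from the data dictionary)."""
    name: str        # worksheet name as listed above (assumed exact)
    typeface: str    # "serif" or "sans serif"
    max_score: int   # maximum achievable score
    n_students: int  # number of students in the sample


SHEETS = (
    SheetSpec("Comprehension test in serif", "serif", 10, 651),
    SheetSpec("Comp. test in sans serif", "sans serif", 10, 651),
    SheetSpec("Information recall in serif", "serif", 5, 379),
    SheetSpec("Inf. recall in sans serif", "sans serif", 5, 379),
)


def check_scores(spec, scores):
    """Return a list of human-readable problems; an empty list means no issue."""
    problems = []
    if len(scores) != spec.n_students:
        problems.append(
            f"{spec.name}: expected {spec.n_students} scores, got {len(scores)}"
        )
    # Row positions are 1-based here; they are not the students' serial numbers.
    for position, score in enumerate(scores, start=1):
        if not 0 <= score <= spec.max_score:
            problems.append(
                f"{spec.name}: score {score} at position {position} "
                f"is outside 0..{spec.max_score}"
            )
    return problems
```

One possible way to obtain the score lists is `pandas.read_excel(path, sheet_name=None)`, which returns one data frame per worksheet; whether the column headers in the file match the names above should be checked against the file itself.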
---

## Methodology

For the research, we used texts of our own creation: equivalent test versions. The two texts were very similar in sentence structure and difficulty level: including the title, both consisted of 201 words and 1275 characters without spaces. In terms of their topics (one introduces the sport of cricket, the other baseball), they were not related to the school curriculum, so they measured reading comprehension skills and not subject knowledge. We consider the difficulty level of the texts to be the same. (Unfortunately, due to the lack of a Hungarian-specific readability formula, it was not possible for us to calculate a readability score.)

The texts were also uniform from a typographical point of view; we used 1.5 line spacing, automatic hyphenation, and justification. Thus, the texts differed only in their specific topic and font type.

Our two chosen font types were Cambria (serif) and Candara (sans-serif). The criteria for selecting the font types were that they should not be the usual Times New Roman – Arial pair, but at the same time, it was important to choose two font types that resemble these two, because these are the ones students encounter most often. We deliberately avoided typefaces that differ significantly from the traditional ones (such as script-like or all-caps).

Reading comprehension was measured based on pre-prepared test questions; the difficulty of the two tests also did not differ. After reading the text, the students encountered two types of tasks: a closed, multiple-choice task (alternative choice: true-false) and an open, answer-generating task (verbal completion: one or more missing concepts). The indicator of the measuring instrument was the quality of the solution, i.e., the score achieved. The maximum achievable score in both tests was ten points. The measurement was done on paper.
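
The Description mentions that a paired-samples t-test was used to compare the serif and sans-serif results. As a rough illustration of that statistic only, here is a minimal sketch using the Python standard library. The function name `paired_t_statistic` and the example scores are hypothetical and are not taken from the dataset; the sketch also assumes that the two score lists are aligned by student, and it is not the authors' analysis script.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a paired-samples t-test.

    `first` and `second` hold the two scores of the same students,
    in the same order (for example serif and sans-serif scores).
    """
    if len(first) != len(second):
        raise ValueError("both score lists must have the same length")
    if len(first) < 2:
        raise ValueError("at least two pairs are required")

    # The test works on the per-student differences.
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    standard_error = sd_diff / math.sqrt(len(diffs))
    return mean_diff / standard_error, len(diffs) - 1


# Hypothetical scores for five students, NOT taken from the dataset.
serif_scores = [7, 8, 6, 9, 5]
sans_serif_scores = [6, 8, 7, 9, 4]

t_value, dof = paired_t_statistic(serif_scores, sans_serif_scores)
print(f"t = {t_value:.3f}, df = {dof}")
```

Turning the t value into a p-value requires the Student's t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_rel`) computes both in one call.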
---

**Data collection period:** between September 2020 and April 2024

---

## Authors and Contributors

* Péter Gombos¹\*#, Barbara Látics²

¹ Hungarian University of Agriculture and Life Sciences, Institute of Education, Institute of Pedagogy, Kaposvár, Hungary

² University of Pécs, Faculty of Humanities and Social Sciences, Education and Society Doctoral School of Education, Pécs, Hungary

\* These authors contributed equally: Péter Gombos, Barbara Látics

\# Corresponding author: gombos.peter@uni-mate.hu

---

## Funding

This work was supported in part by the National Research Development and Innovation Office of Hungary (NKFI K - 135824).

---

## License

This dataset is available under the [Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/) license.

This means you are free to:

* **Share:** copy and redistribute the material in any medium or format.
* **Adapt:** remix, transform, and build upon the material for any purpose, even commercially.

These freedoms apply on the condition that you give appropriate **credit** to the original authors and the source, provide a link to the license, and indicate if changes were made.

---

## Contact

For questions regarding the research or the data, please contact the authors: Péter Gombos, gombos.peter@uni-mate.hu; Barbara Látics, lbarbi0604@gmail.com