Page Regions
A single page contains regions that differ in structure and content. Once the print space has been identified or marked at capture level 3 (structure GT level), the layout is analyzed. The different regions must be distinguished and labelled as such:
- Text region : TextRegion,
- Illustrations, photos : ImageRegion,
- Book decoration, drawings : GraphicRegion,
- Separator lines, separators : SeparatorRegion,
- Tables : TableRegion,
- Line drawing : LineDrawingRegion,
- Maps : MapRegion,
- Mathematical formula : MathsRegion,
- Chemical formula : ChemRegion,
- Musical notes : MusicRegion,
- Advertisement : AdvertRegion and
- Dirt, stains, noise : NoiseRegion
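The labels above are element names, so a quick way to see which regions a page actually carries is to count them. The following is a minimal sketch, assuming the page is stored as PAGE XML; the `count_regions` helper, the `REGION_TYPES` table, and the sample page (file name, ids, and coordinates) are hypothetical and only serve as illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Region element names from the list above, with their plain-language meaning.
REGION_TYPES = {
    "TextRegion": "Text region",
    "ImageRegion": "Illustrations, photos",
    "GraphicRegion": "Book decoration, drawings",
    "SeparatorRegion": "Separator lines, separators",
    "TableRegion": "Tables",
    "LineDrawingRegion": "Line drawing",
    "MapRegion": "Maps",
    "MathsRegion": "Mathematical formula",
    "ChemRegion": "Chemical formula",
    "MusicRegion": "Musical notes",
    "AdvertRegion": "Advertisement",
    "NoiseRegion": "Dirt, stains, noise",
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def count_regions(page_xml):
    """Count how often each known region type occurs in a PAGE XML string."""
    root = ET.fromstring(page_xml)
    return Counter(
        local_name(el.tag)
        for el in root.iter()
        if local_name(el.tag) in REGION_TYPES
    )


# Hypothetical sample page, invented for this sketch.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="page_0001.tif" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1"><Coords points="100,100 900,100 900,400 100,400"/></TextRegion>
    <TextRegion id="r2"><Coords points="100,500 900,500 900,900 100,900"/></TextRegion>
    <ImageRegion id="r3"><Coords points="1000,100 1900,100 1900,900 1000,900"/></ImageRegion>
    <SeparatorRegion id="r4"><Coords points="100,950 1900,950 1900,960 100,960"/></SeparatorRegion>
  </Page>
</PcGts>"""

if __name__ == "__main__":
    for name, n in sorted(count_regions(SAMPLE).items()):
        print(f"{name} ({REGION_TYPES[name]}): {n}")
```

The helper compares local tag names only, so it does not depend on a particular PAGE schema version.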
The transcription level determines how deeply a text can be transcribed.
Important: It is not sufficient to mark the entire print space as a single
TextRegion; the individually delimitable blocks must each be localized. This
also makes it possible to clearly distinguish between textual and non-textual
segments.
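The rule above can be turned into a rough plausibility check. The sketch below is a heuristic, not a prescribed validation: it assumes PAGE XML input, and the function name `looks_like_unsegmented_print_space`, the `min_share` threshold of 0.5, and both sample pages are hypothetical. It flags a page whose only TextRegion covers a large share of the image, which suggests that the print space was labelled as one block.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def bbox_area(points):
    """Area of the bounding box of a PAGE 'points' string such as '0,0 10,0 10,5 0,5'."""
    xy = [tuple(int(v) for v in p.split(",")) for p in points.split()]
    xs = [x for x, _ in xy]
    ys = [y for _, y in xy]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def looks_like_unsegmented_print_space(page_xml, min_share=0.5):
    """Return True if the page has exactly one TextRegion covering >= min_share of the image.

    min_share is an assumed threshold for this sketch, not a value from the guidelines.
    """
    root = ET.fromstring(page_xml)
    page = next(el for el in root.iter() if local_name(el.tag) == "Page")
    page_area = int(page.get("imageWidth")) * int(page.get("imageHeight"))
    # Only top-level regions of the page are considered here.
    text_regions = [el for el in page if local_name(el.tag) == "TextRegion"]
    if len(text_regions) != 1:
        return False
    coords = next(c for c in text_regions[0] if local_name(c.tag) == "Coords")
    return bbox_area(coords.get("points")) / page_area >= min_share


NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"

# Hypothetical page: the whole print space marked as one TextRegion.
ONE_BLOCK = f"""<PcGts xmlns="{NS}">
  <Page imageFilename="a.tif" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1"><Coords points="100,100 1900,100 1900,2900 100,2900"/></TextRegion>
  </Page>
</PcGts>"""

# Hypothetical page: heading and body localized as separate blocks.
TWO_BLOCKS = f"""<PcGts xmlns="{NS}">
  <Page imageFilename="b.tif" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1"><Coords points="100,100 1900,100 1900,400 100,400"/></TextRegion>
    <TextRegion id="r2"><Coords points="100,500 1900,500 1900,2900 100,2900"/></TextRegion>
  </Page>
</PcGts>"""

if __name__ == "__main__":
    print(looks_like_unsegmented_print_space(ONE_BLOCK))
    print(looks_like_unsegmented_print_space(TWO_BLOCKS))
```

A page that genuinely consists of one continuous text block would also be flagged, so a hit is a prompt for manual review rather than proof of an error.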