Publications by topic: Automatic Pagination
A General LuaTeX Framework for Globally Optimized Pagination (peer reviewed version)
- Frank Mittelbach
- Paper submitted to the Computational Intelligence Journal (Wiley) in 2017, accepted January 2018, published 2019
- Abstract:
Pagination problems deal with questions around transforming a source text stream into a formatted document by dividing it up into individual columns and pages, including adding auxiliary elements that have some relationship to the source stream data but may allow a certain amount of variation in placement (such as figures or footnotes).
Traditionally the pagination problem has been approached by separating it into one of micro-typography (e.g., breaking text into paragraphs, also known as h&j) and one of macro-typography (e.g., taking a galley of already formatted paragraphs and breaking them into columns and pages) without much interaction between the two.
While early solutions for both problem areas used simple greedy algorithms, Knuth and Plass (1981) introduced in the ’80s a global-fit algorithm for line breaking that optimizes the breaks across the whole paragraph. This algorithm was implemented in TeX’82 (see Knuth (1986b)) and has since kept its crown as the best available solution for this space. However, for macro-typography there has been no (successful) attempt to provide globally optimized page layout: all systems to date (including TeX) use greedy algorithms for pagination. Various problems in this area have been researched and the literature documents some prototype development. But none of them have been made widely available to the research community or ever made it into a generally usable and publicly available system.
This paper is an extended version of the work by Mittelbach (2016) originally presented at the DocEng ’16 conference in Vienna. It presents a framework for a global-fit algorithm for page breaking based on the ideas of Knuth/Plass. It is implemented in such a way that it is directly usable without additional executables with any modern TeX installation. It therefore can serve as a test bed for future experiments and extensions in this space. At the same time a cleaned-up version of the current prototype has the potential to become a production tool for the huge number of TeX users world-wide.
The paper also discusses two already implemented extensions that increase the flexibility of the pagination process (a necessary prerequisite for successful global optimization): the ability to automatically consider existing flexibility in paragraph length (by considering paragraph variations with different numbers of lines) and the concept of running the columns on a double spread a line long or short. It concludes with a discussion of the overall approach, its inherent limitations and directions for future research.
This article is an extended version (37 pages) of the 2016 ACM article “A General Framework for Globally Optimized Pagination”, providing a lot more details and additional research results.
Legal notice from Wiley
This is the peer reviewed version of the following article: Frank Mittelbach. “A general LuaTeX framework for globally optimized pagination”. Computational Intelligence, 35(2):242–284, 2019, which has been published in final form at https://doi.org/10.1111/coin.12165. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions. This article may not be enhanced, enriched or otherwise transformed into a derivative work, without express permission from Wiley or by statutory rights under applicable legislation. Copyright notices must not be removed, obscured or modified. The article must be linked to Wiley’s version of record on Wiley Online Library and any embedding, framing or otherwise making available the article or pages thereof by third parties from platforms, services and websites other than Wiley Online Library must be prohibited.
A General LuaTeX Framework for Globally Optimized Pagination (pre-peer reviewed version)
- Frank Mittelbach
- Paper submitted to the Computational Intelligence Journal (Wiley) in 2017, accepted January 2018
- Abstract:
Pagination problems deal with questions around transforming a source text stream into a formatted document by dividing it up into individual columns and pages, including adding auxiliary elements that have some relationship to the source stream data but may allow a certain amount of variation in placement (such as figures or footnotes).
Traditionally the pagination problem has been approached by separating it into one of micro-typography (e.g., breaking text into paragraphs, also known as h&j) and one of macro-typography (e.g., taking a galley of already formatted paragraphs and breaking them into columns and pages) without much interaction between the two.
While early solutions for both problem areas used simple greedy algorithms, Knuth and Plass (1981) introduced in the ’80s a global-fit algorithm for line breaking that optimizes the breaks across the whole paragraph. This algorithm was implemented in TeX’82 (see Knuth (1986b)) and has since kept its crown as the best available solution for this space. However, for macro-typography there has been no (successful) attempt to provide globally optimized page layout: all systems to date (including TeX) use greedy algorithms for pagination. Various problems in this area have been researched and the literature documents some prototype development. But none of them have been made widely available to the research community or ever made it into a generally usable and publicly available system.
This paper is an extended version of the work by Mittelbach (2016) originally presented at the DocEng ’16 conference in Vienna. It presents a framework for a global-fit algorithm for page breaking based on the ideas of Knuth/Plass. It is implemented in such a way that it is directly usable without additional executables with any modern TeX installation. It therefore can serve as a test bed for future experiments and extensions in this space. At the same time a cleaned-up version of the current prototype has the potential to become a production tool for the huge number of TeX users world-wide.
The paper also discusses two already implemented extensions that increase the flexibility of the pagination process (a necessary prerequisite for successful global optimization): the ability to automatically consider existing flexibility in paragraph length (by considering paragraph variations with different numbers of lines) and the concept of running the columns on a double spread a line long or short. It concludes with a discussion of the overall approach, its inherent limitations and directions for future research.
This is the pre-peer reviewed version of the article; it will be replaced by the peer reviewed version after the 12-month embargo phase. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.
The peer reviewed and published version is now available as A General LuaTeX Framework for Globally Optimized Pagination (journal version).
This article is an extended version (37 pages) of the 2016 ACM article “A General Framework for Globally Optimized Pagination”, providing a lot more details and additional research results.
From the ACM DocEng Conference 2017 (Valletta, Malta)
Effective Floating Strategies
- Presentation of the paper as given in Malta: Effective Floating Strategies (slides – large 23Mb)
This paper presents an extension to the general framework for globally optimized pagination described in Mittelbach (2016). The extended algorithm supports automatic placement of floats as part of the optimization. It uses a flexible constraint model that allows for the implementation of typical typographic rules that can be weighted against each other to support different application scenarios.
The above link enables free download of the paper from the ACM Digital Library. (Due to ACM restrictions it unfortunately doesn’t work from the “all-publications” page. If you are there please use the one on the pagination topic page instead.)
From the TUG/GUST Conference 2017 (Bachotek, Poland)
Through The Looking Glass — and what Alice found there … (handouts)
- Frank Mittelbach
- TUG/GUST Conference 2017 (Bachotek, Poland)
Continuing the quest for automatically finding optimal pagination of documents, the journey now takes us to the fairy land of objective functions, call-out constraints, layout templates and other mystical creatures, and a Queen that cries “Faster! Faster!” because “… it takes all the running YOU can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!” This talk explores how fast we must run to enter that world.
Slides of the talk: Through The Looking Glass — and what Alice found there …
From the TUG Conference 2016 (Toronto, Canada)
Alice goes floating (slides with speaker notes intermixed)
- Frank Mittelbach
- TUG Conference 2016 (Toronto, Canada)
In this talk a framework for globally optimizing the pagination of documents containing floats is demonstrated. Alice in Wonderland by Lewis Carroll was chosen as the main example. If such a document is formatted using standard LaTeX, it will result in a pagination with many issues, as demonstrated here. If the same document is formatted using the new framework, one will get a globally optimized solution, as shown here. At the moment the framework is still in its early stages and not yet publicly available, as further research and development is needed.
Video of the talk recorded by River Valley TV: Alice goes floating (audio near the end fails unfortunately)
From ACM DocEng conference 2016 (Vienna, Austria)
A General Framework for Globally Optimized Pagination
This paper presents an algorithm for globally optimized pagination using dynamic programming and discusses its theoretical background. It was awarded the “ACM Best Paper Award” at the DocEng 2016 conference. The paper is the basis for the work demonstrated at BachoTek and TUG 2016 (the order appears reversed because the submission deadline for DocEng was already in March, while the conference itself took place in September).
A greatly extended version of this paper (37 pages) titled “A General LuaTeX Framework for Globally Optimized Pagination” was submitted to the Computational Intelligence Journal (Wiley) in 2017 and accepted January 2018.
The above link enables free download of the paper from the ACM Digital Library. (Due to ACM restrictions it unfortunately doesn’t work from the “all-publications” page. If you are there please use the one on the pagination topic page instead.)
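To give a feel for the difference between greedy and globally optimized page breaking that the DocEng 2016 paper addresses, here is a minimal Python sketch. It is not the algorithm from the paper, and it is not LuaTeX code: the cost model (squared page shortfall plus a fixed penalty of 50 for stranding a single line of a paragraph at a page boundary), the function names, and the sample galley are all assumptions chosen purely for illustration. The greedy variant fills every page completely; the dynamic-programming variant considers all feasible breaks and picks the sequence with the lowest total cost.

```python
import math

def break_penalties(paragraphs, club_widow_penalty=50):
    """Penalty for breaking after each line of the flattened galley.

    `paragraphs` is a list of line counts. The penalty value is an
    illustrative assumption, not a value taken from the paper.
    """
    penalties = []
    for n in paragraphs:
        for k in range(1, n + 1):
            # Breaking after the first line or before the last line of a
            # multi-line paragraph strands a single line on a page.
            stranded = n > 1 and (k == 1 or k == n - 1)
            penalties.append(club_widow_penalty if stranded else 0)
    return penalties

def page_cost(start, end, total, page_lines, penalties):
    """Cost of a page holding lines start..end-1 (hypothetical cost model)."""
    if end == total:
        return 0  # the last page may run short and has no break after it
    shortfall = page_lines - (end - start)
    return shortfall ** 2 + penalties[end - 1]

def paginate_greedy(paragraphs, page_lines):
    """Fill each page completely, ignoring the effect on later pages."""
    penalties = break_penalties(paragraphs)
    total = len(penalties)
    breaks, cost, start = [], 0, 0
    while start < total:
        end = min(start + page_lines, total)
        cost += page_cost(start, end, total, page_lines, penalties)
        breaks.append(end)
        start = end
    return breaks, cost

def paginate_optimal(paragraphs, page_lines):
    """Dynamic programming over all feasible page breaks."""
    penalties = break_penalties(paragraphs)
    total = len(penalties)
    best = [0] + [math.inf] * total   # best[j]: minimal cost to set lines 0..j-1
    prev = [0] * (total + 1)          # prev[j]: start of the page ending at j
    for end in range(1, total + 1):
        for start in range(max(0, end - page_lines), end):
            cost = best[start] + page_cost(start, end, total, page_lines, penalties)
            if cost < best[end]:
                best[end], prev[end] = cost, start
    # Walk the predecessor chain backwards to recover the break positions.
    breaks, pos = [], total
    while pos > 0:
        breaks.append(pos)
        pos = prev[pos]
    breaks.reverse()
    return breaks, best[total]
```

For a sample galley of three paragraphs with 4, 3 and 5 lines and pages of 5 lines, `paginate_greedy([4, 3, 5], 5)` breaks after lines 5, 10 and 12 and pays the stranded-line penalty on the first page (total cost 50), whereas `paginate_optimal([4, 3, 5], 5)` runs the first page one line short and breaks after lines 4, 9 and 12 for a total cost of 1.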
Formatting documents with floats – A new algorithm for LaTeX2e
- Frank Mittelbach
- Published paper, 2000
- Keywords: LaTeX3, page makeup, models, concepts, prototypes
At the GUTenberg meeting in Toulouse, Frank presented a paper about a new output routine that is intended to enhance the way LaTeX deals with floating objects in multicolumn environments.
Publications by topic
Under each topic you will find relevant articles and papers on related subjects published by the LaTeX3 project as well as links to videos of their conference presentations.
Publications by year
An alternative view of all publications ordered by year is given on the Publications by Year page.
Books by project members and others
A list of books that we think are useful is given on the Books Page. By buying documentation through this website you support the volunteer work of project members to keep LaTeX useful for you.
- Current LaTeX (LaTeX2e)
- LaTeX -> LaTeX3
- PDF, Tagging, Accessibility
- Coding, Testing & Support
- Other topics independent of the LaTeX version