diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
new file mode 100644
index 000000000..60254ba61
--- /dev/null
+++ b/ARCHITECTURE.md
@@ -0,0 +1,171 @@
+# Typst Compiler Architecture
+Wondering how to contribute or just curious how Typst works? This document
+covers the general architecture of Typst's compiler, so you get an understanding
+of what's where and how everything fits together.
+
+The source-to-PDF compilation process of a Typst file proceeds in four phases.
+
+1. **Parsing:** Turns a source string into a syntax tree.
+2. **Evaluation:** Turns a syntax tree and its dependencies into content.
+3. **Layout:** Lays out content into frames.
+4. **Export:** Turns frames into an output format like PDF or a raster graphic.
+
+The Typst compiler is _incremental:_ Recompiling a document that was compiled
+previously is much faster than compiling from scratch. Most of the hard work is
+done by [`comemo`], an incremental compilation framework we have written for
+Typst. However, the compiler is still carefully written with incrementality in
+mind. Below we discuss the four phases and how incrementality affects each of
+them.
+
+
+## Parsing
+The syntax tree and parser are located in `src/syntax`. Parsing is a pure
+function `&str -> SyntaxNode` without any further dependencies. The result is a
+concrete syntax tree reflecting the whole file structure, including whitespace
+and comments. Parsing cannot fail. If there are syntactic errors, the returned
+syntax tree contains error nodes instead. It's important that the parser deals
+well with broken code because it is also used for syntax highlighting and IDE
+functionality.
+
+**Typedness:**
+The syntax tree is untyped: any node can have any `SyntaxKind`. This makes it
+very easy to (a) attach spans to each node (see below) and (b) traverse the tree
+when doing highlighting or IDE analyses (no extra complications like a visitor
+pattern). The `typst::syntax::ast` module provides a typed API on top of
+the raw tree. This API resembles a more classical AST and is used by the
+interpreter.
+
+**Spans:**
+After parsing, the syntax tree is numbered with _span numbers._ These numbers
+are unique identifiers for syntax nodes that are used to trace back errors in
+later compilation phases to a piece of syntax. The span numbers are ordered so
+that the node corresponding to a number can be found quickly.
+
+**Incremental:**
+Typst has an incremental parser that can reparse a segment of markup or a
+code/content block. After incremental parsing, span numbers are reassigned
+locally. This way, span numbers further away from an edit stay mostly stable.
+This is important because they are used pervasively throughout the compiler,
+also as input to memoized functions. The less they change, the better for
+incremental compilation.
+
+
+## Evaluation
+The evaluation phase lives in `src/eval`. It takes a parsed `Source` file and
+evaluates it to a `Module`. A module consists of the `Content` that was written
+in it and a `Scope` with the bindings that were defined within it.
+
+A source file may depend on other files (imported sources, images, data files),
+which need to be resolved. Since Typst is deployed in different environments
+(CLI, web app, etc.) these system dependencies are resolved through a general
+interface called a `World`. Apart from files, the world also provides
+configuration and fonts.
+
+**Interpreter:**
+Typst implements a tree-walking interpreter. To evaluate a piece of source, you
+first create a `Vm` with a scope stack. Then, the AST is recursively evaluated
+through trait impls of the form `fn eval(&self, vm: &mut Vm) -> Result`.
+An interesting detail is how closures are dealt with: When the interpreter sees
+a closure / function definition, it walks the body of the closure and finds all
+accesses to variables that aren't defined within the closure. It then clones the
+values of all these variables (it _captures_ them) and stores them alongside the
+closure's syntactical definition in a closure value. When the closure is called,
+a fresh `Vm` is created and its scope stack is initialized with the captured
+variables.
+
+**Incremental:**
+In this phase, incremental compilation happens at the granularity of the module
+and the closure. Typst memoizes the result of evaluating a source file across
+compilations. Furthermore, it memoizes the result of calling a closure with a
+certain set of parameters. This is possible because Typst ensures that all
+functions are pure. The result of a closure call can be recycled if the closure
+has the same syntax and captures, even if the closure value stems from a
+different module evaluation (i.e. if a module is reevaluated, previous calls to
+closures defined in the module can still be reused).
+
+
+## Layout
+The layout phase takes `Content` and produces one `Frame` per page for it. To
+lay out `Content`, we first have to _realize_ it by applying all relevant show
+rules to the content. Since show rules may be defined as Typst closures,
+realization can trigger closure evaluation, which in turn produces content that
+is recursively realized. Realization is a shallow process: While collecting list
+items into a list that we want to lay out, we don't realize the content within
+the list items just yet. This only happens lazily once the list items are
+laid out.
+
+When we have realized the content into a layoutable
+node, we can then lay it out into _regions,_ which describe the space into which
+the content shall be laid out. Within these, a node is free to lay itself out
+as it sees fit, returning one `Frame` per region it wants to occupy.
+
+**Introspection:**
+How content is laid out (and realized) may depend on how _it itself_ is laid out
+(e.g., through page numbers in the table of contents, counters, state, etc.).
+Typst resolves these inherently cyclical dependencies through the _introspection
+loop:_ The layout phase runs in a loop until the results stabilize. Most
+introspections stabilize after one or two iterations. However, some may never
+stabilize, so we give up after five attempts.
+
+**Incremental:**
+Layout caching happens at the granularity of a node. This is important because
+overall layout is the most expensive compilation phase, so we want to reuse as
+much as possible.
+
+
+## Export
+Exporters live in `src/export`. They turn laid-out frames into an output file
+format.
+
+- The PDF exporter takes laid-out frames and turns them into a PDF file.
+- The built-in renderer takes a frame and turns it into a pixel buffer.
+- HTML export does not exist yet, but will in the future. However, this requires
+  some complex compiler work because the export will start with `Content`
+  instead of `Frames` (layout is the browser's job).
+
+
+## IDE
+The `src/ide` module implements IDE functionality for Typst. It builds heavily
+on the other modules (most importantly, `syntax` and `eval`).
+
+**Syntactic:**
+Basic IDE functionality is based on a file's syntax. However, the standard
+syntax node is a bit too limited for writing IDE tooling. It doesn't provide
+access to its parents or neighbours. This is fine for an evaluation-like
+recursive traversal, but impractical for IDE use cases. For this reason, there
+is an additional abstraction on top of a syntax node called a `LinkedNode`,
+which is used pervasively across the `ide` module.
+
+**Semantic:**
+More advanced functionality like autocompletion requires semantic analysis of
+the source. To gain semantic information for things like hover tooltips, we
+directly use other parts of the compiler. For instance, to find out the type of
+a variable, we evaluate and realize the full document equipped with a `Tracer`
+that emits the variable's value whenever it is visited. From the set of
+resulting values, we can then compute the set of types a value takes on. Thanks
+to incremental compilation, we can recycle large parts of the compilation that
+we had to do anyway to typeset the document.
+
+**Incremental:**
+Syntactic IDE stuff is relatively cheap for now, so there are no special
+incrementality concerns. Semantic analysis with a tracer is relatively
+expensive. However, large parts of a traced analysis compilation can reuse
+memoized results from a previous normal compilation. Only the module evaluation
+of the active file and layout code that somewhere within evaluates source code
+in the active file need to re-run. This is all handled automatically by
+`comemo` because the tracer is wrapped in a `comemo::TrackedMut` container.
+
+
+## Tests
+Typst has an extensive suite of integration tests. A test file consists of
+multiple tests that are separated by `---`. For each test file, we store a
+reference image defining what the compiler _should_ output. To manage the
+reference images, you can use the VS Code extension in `tools/test-helper`.
+
+The integration tests cover parsing, evaluation, realization, layout and
+rendering. PDF output is sadly untested, but most bugs are in earlier phases of
+the compiler; the PDF output itself is relatively straightforward. IDE
+functionality is also mostly untested. PDF and IDE testing should be added in
+the future.
+
+[`comemo`]: https://github.com/typst/comemo/
diff --git a/README.md b/README.md
index b29534a62..73ad70027 100644
--- a/README.md
+++ b/README.md
@@ -35,7 +35,7 @@ currently in public beta.
 ## Example
 This is what a Typst file with a bit of math and automation looks like:

- Example + Example

 Let's disect what's going on:
@@ -165,13 +165,13 @@ instant preview. To achieve these goals, we follow three core design principles:
 Luckily we have [`comemo`], a system for incremental compilation which does
 most of the hard work in the background.
 
-[docs]: https://typst.app/docs
+[docs]: https://typst.app/docs/
 [app]: https://typst.app/
 [discord]: https://discord.gg/2uDybryKPe
 [show]: https://typst.app/docs/reference/styling/#show-rules
 [math]: https://typst.app/docs/reference/math/
 [scripting]: https://typst.app/docs/reference/scripting/
-[rust]: https://rustup.rs
-[releases]: https://github.com/typst/typst/releases
+[rust]: https://rustup.rs/
+[releases]: https://github.com/typst/typst/releases/
 [architecture]: https://github.com/typst/typst/blob/main/ARCHITECTURE.md
 [`comemo`]: https://github.com/typst/comemo/
diff --git a/library/src/layout/mod.rs b/library/src/layout/mod.rs
index b29da700e..4a38acb66 100644
--- a/library/src/layout/mod.rs
+++ b/library/src/layout/mod.rs
@@ -380,8 +380,11 @@ impl<'a, 'v, 't> Builder<'a, 'v, 't> {
         let Some(doc) = &mut self.doc else { return Ok(()) };
         if !self.flow.0.is_empty() || (doc.keep_next && styles.is_some()) {
             let (flow, shared) = mem::take(&mut self.flow).0.finish();
-            let styles =
-                if shared == StyleChain::default() { styles.unwrap() } else { shared };
+            let styles = if shared == StyleChain::default() {
+                styles.unwrap_or_default()
+            } else {
+                shared
+            };
             let page = PageNode::new(FlowNode::new(flow.to_vec()).pack()).pack();
             let stored = self.scratch.content.alloc(page);
             self.accept(stored, styles)?;
diff --git a/library/src/meta/query.rs b/library/src/meta/query.rs
index bab8ed7c0..86419be0e 100644
--- a/library/src/meta/query.rs
+++ b/library/src/meta/query.rs
@@ -39,14 +39,14 @@ cast_from_value! {
 /// Display: Query
 /// Category: special
 #[node(Locatable, Show)]
-pub struct QueryNode {
+struct QueryNode {
     /// The thing to search for.
     #[required]
-    pub target: Selector,
+    target: Selector,
 
     /// The function to format the results with.
     #[required]
-    pub format: Func,
+    format: Func,
 }
 
 impl Show for QueryNode {
@@ -58,7 +58,6 @@ impl Show for QueryNode {
         let id = self.0.stable_id().unwrap();
         let target = self.target();
         let (before, after) = vt.introspector.query_split(target, id);
-        let func = self.format();
-        Ok(func.call_vt(vt, [before.into(), after.into()])?.display())
+        Ok(self.format().call_vt(vt, [before.into(), after.into()])?.display())
     }
 }
diff --git a/src/eval/array.rs b/src/eval/array.rs
index e42fd28da..fa71ff1a5 100644
--- a/src/eval/array.rs
+++ b/src/eval/array.rs
@@ -5,7 +5,7 @@ use std::ops::{Add, AddAssign};
 use ecow::{eco_format, EcoString, EcoVec};
 
 use super::{ops, Args, Func, Value, Vm};
-use crate::diag::{bail, At, SourceResult, StrResult};
+use crate::diag::{At, SourceResult, StrResult};
 use crate::util::pretty_array_like;
 
 /// Create a new [`Array`] from values.
@@ -139,9 +139,6 @@ impl Array {
 
     /// Return the first matching element.
     pub fn find(&self, vm: &mut Vm, func: Func) -> SourceResult<Option<Value>> {
-        if func.argc().map_or(false, |count| count != 1) {
-            bail!(func.span(), "function must have exactly one parameter");
-        }
         for item in self.iter() {
             let args = Args::new(func.span(), [item.clone()]);
             if func.call_vm(vm, args)?.cast::<bool>().at(func.span())? {
@@ -153,9 +150,6 @@ impl Array {
 
     /// Return the index of the first matching element.
     pub fn position(&self, vm: &mut Vm, func: Func) -> SourceResult<Option<i64>> {
-        if func.argc().map_or(false, |count| count != 1) {
-            bail!(func.span(), "function must have exactly one parameter");
-        }
         for (i, item) in self.iter().enumerate() {
             let args = Args::new(func.span(), [item.clone()]);
             if func.call_vm(vm, args)?.cast::<bool>().at(func.span())? {
@@ -169,9 +163,6 @@ impl Array {
     /// Return a new array with only those elements for which the function
     /// returns true.
     pub fn filter(&self, vm: &mut Vm, func: Func) -> SourceResult<Self> {
-        if func.argc().map_or(false, |count| count != 1) {
-            bail!(func.span(), "function must have exactly one parameter");
-        }
         let mut kept = EcoVec::new();
         for item in self.iter() {
             let args = Args::new(func.span(), [item.clone()]);
@@ -184,9 +175,6 @@ impl Array {
 
     /// Transform each item in the array with a function.
     pub fn map(&self, vm: &mut Vm, func: Func) -> SourceResult<Self> {
-        if func.argc().map_or(false, |count| !(1..=2).contains(&count)) {
-            bail!(func.span(), "function must have one or two parameters");
-        }
         let enumerate = func.argc() == Some(2);
         self.iter()
             .enumerate()
@@ -203,9 +191,6 @@ impl Array {
 
     /// Fold all of the array's elements into one with a function.
     pub fn fold(&self, vm: &mut Vm, init: Value, func: Func) -> SourceResult<Value> {
-        if func.argc().map_or(false, |count| count != 2) {
-            bail!(func.span(), "function must have exactly two parameters");
-        }
         let mut acc = init;
         for item in self.iter() {
             let args = Args::new(func.span(), [acc, item.clone()]);
@@ -216,9 +201,6 @@ impl Array {
 
     /// Whether any element matches.
     pub fn any(&self, vm: &mut Vm, func: Func) -> SourceResult<bool> {
-        if func.argc().map_or(false, |count| count != 1) {
-            bail!(func.span(), "function must have exactly one parameter");
-        }
         for item in self.iter() {
             let args = Args::new(func.span(), [item.clone()]);
             if func.call_vm(vm, args)?.cast::<bool>().at(func.span())? {
@@ -231,9 +213,6 @@ impl Array {
 
     /// Whether all elements match.
     pub fn all(&self, vm: &mut Vm, func: Func) -> SourceResult<bool> {
-        if func.argc().map_or(false, |count| count != 1) {
-            bail!(func.span(), "function must have exactly one parameter");
-        }
         for item in self.iter() {
             let args = Args::new(func.span(), [item.clone()]);
             if !func.call_vm(vm, args)?.cast::<bool>().at(func.span())? {
diff --git a/src/model/styles.rs b/src/model/styles.rs
index 8cccb5f67..7b725af9a 100644
--- a/src/model/styles.rs
+++ b/src/model/styles.rs
@@ -343,12 +343,7 @@ impl Debug for Transform {
 cast_from_value! {
     Transform,
     content: Content => Self::Content(content),
-    func: Func => {
-        if func.argc().map_or(false, |count| count != 1) {
-            Err("function must have exactly one parameter")?
-        }
-        Self::Func(func)
-    },
+    func: Func => Self::Func(func),
 }
 
 /// A chain of style maps, similar to a linked list.
@@ -494,6 +489,15 @@ impl<'a> StyleChain<'a> {
         })
     }
 
+    /// Convert to a style map.
+    pub fn to_map(self) -> StyleMap {
+        let mut suffix = StyleMap::new();
+        for link in self.links() {
+            suffix.0.splice(0..0, link.iter().cloned());
+        }
+        suffix
+    }
+
     /// Iterate over the entries of the chain.
     fn entries(self) -> Entries<'a> {
         Entries { inner: [].as_slice().iter(), links: self.links() }
diff --git a/tests/typ/compiler/array.typ b/tests/typ/compiler/array.typ
index b51ee759b..b9e5517ec 100644
--- a/tests/typ/compiler/array.typ
+++ b/tests/typ/compiler/array.typ
@@ -163,7 +163,7 @@
 #test((1, 2, 3, 4).fold(0, (s, x) => s + x), 10)
 
 ---
-// Error: 20-30 function must have exactly two parameters
+// Error: 20-22 unexpected argument
 #(1, 2, 3).fold(0, () => none)
 
 ---
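
Reviewer note on the `StyleChain::to_map` addition in `src/model/styles.rs`: the `splice(0..0, ...)` call prepends each link to the map being built, so links visited later end up earlier in the result. The sketch below only illustrates that ordering with a plain standard-library `Vec`. The function name `flatten_links`, the string entries, and the assumption that `links()` yields the innermost link first are all hypothetical and not taken from the Typst sources.

```rust
/// Hypothetical stand-in for a style chain: each inner `Vec` is one link.
/// Assumption for this sketch: links are listed innermost-first.
fn flatten_links(links: &[Vec<&'static str>]) -> Vec<&'static str> {
    let mut flat = Vec::new();
    for link in links {
        // Splicing into the empty range `0..0` removes nothing and inserts
        // the whole link at the front, keeping the order within the link.
        flat.splice(0..0, link.iter().cloned());
    }
    flat
}

fn main() {
    let links = vec![vec!["inner-a", "inner-b"], vec!["middle"], vec!["outer"]];
    let flat = flatten_links(&links);
    // Later links land in front of earlier ones.
    assert_eq!(flat, ["outer", "middle", "inner-a", "inner-b"]);
    println!("{flat:?}");
}
```

Under that assumption, the flattened map lists outer styles before inner ones while preserving the order of entries within each link.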