Feeds
Russ Allbery: Review: Dark Horse
Review: Dark Horse, by Michelle Diener
Series: Class 5 #1
Publisher: Eclipse
Copyright: June 2015
ISBN: 0-9924559-3-6
Format: Kindle
Pages: 366
Dark Horse is a science fiction romance novel, the first of a five book series as of this writing. It is self-published, although it is sufficiently well-edited and packaged that I had to do some searching to confirm that.
Rose was abducted by aliens. The Tecrans picked her up along with a selection of Earth animals, kept her in a cell in their starship, and experimented on her. As the book opens, she has managed to make her escape with the aid of an AI named Sazo who was also imprisoned on the Tecran ship. Sazo dealt with the Tecrans, dropped the ship in the middle of Grih territory, and then got Rose and most of the animals on shuttles to a nearby planet.
Dav Jallan is the commander of the ship the Grih sent to investigate the unexplained appearance of a Class 5 Tecran warship in the middle of their territory. The Grih and the Tecran, along with three other species, are members of the United Council, which means in theory they're all at peace. With the Tecran, that theory is often strained. Dav is not going to turn down one of their highly-advanced Class 5 warships delivered to him on a silver platter. There is only the matter of the unexpected cargo, the first orange dots (indicating unknown life forms) that most of the Grih have ever seen.
There is a romance. That romance did not work for me. I thought it was highly unprofessional on Dav's part and a bit too obviously constructed on the author's part. It also leans on the subgenre convention that aliens can be remarkably physically similar and sexually compatible, which always causes problems for my suspension of disbelief even though I know it's no less plausible than faster-than-light travel.
Despite that, I had so much fun with this book! It was absolutely delightful and weirdly grabby in a way that caught me by surprise. I was skimming some parts of it to write this review and found myself re-reading multiple pages before I dragged myself back on task.
I think the most charming part of this book is that the United Council has a law called the Sentient Beings Agreement that makes what the Tecran were doing extremely illegal, and the Grih and the other non-Tecran aliens take this very seriously and with a refreshing lack of cynicism. Rose has a typical human reaction to ending up in a place where she doesn't know the rules and isn't entirely an expected guest. She almost reflexively smoothes over miscommunications and tensions, trying to adapt to their expectations. And then, repeatedly, the Grih realize how much work she's doing to adapt to them, feel enraged at the Tecran and upset that they didn't understand or properly explain something, and find some way to make Rose feel more comfortable. It's surprisingly soothing and comforting to read.
It occurred to me in several places that Dark Horse could be read as a wish-fulfillment fantasy of what life as a woman could be like if men took their fair share of the mental load. (This concept is usually applied to housework, but I think it generalizes to other social and communication contexts.) I suspect this was not an accident.
There is a lot of wish fulfillment in this book. The Grih are very human-like but hunky, which is convenient for the romance subplot. They struggle to sing, value music exceptionally highly, and consider Rose's speaking voice beautifully musical. Her typical human habit of singing to herself is a source of immediate and almost overwhelming fascination. The supplies Rose takes from the Tecran ship when she flees just happen to be absurdly expensive scented shampoo and equally expensive luxury adaptable clothing. The world she lands on, and the Grih ship, are low-gravity compared to Earth, so Rose is unusually strong for her size. Grih military camouflage has no effect on her human vision. The book is set up to make Rose special.
If that type of wish fulfillment is going to grate, wait on this book until you're more in the mood for it. But I like wish fulfillment books when they're done well. Part of why I like to read is to imagine a better world. And Rose isn't doted on; despite their hospitality, she's constantly underestimated by the Grih. Even with their deep belief in the Sentient Beings Agreement, they find it hard to believe that an unknown sentient, even an advanced sentient, is really their equal. Their concern at the start is somewhat patronizing, so watching Rose constantly surprise them delighted the part of my brain that likes both competence porn and deserved reversals, even though the competence here is often due to accidents of biology. It helps that Diener tells the story in alternating perspectives, so the reader first watches Rose do something practical and straightforward from her perspective and then gets to enjoy the profound surprise and chagrin of the aliens.
There is a plot beneath this first contact story, and beyond the political problem of figuring out what to do with Rose and the Tecran. Sazo, Rose's AI friend, does not want the Grih to know he exists. He has a history that Rose does not know about and may not be entirely safe. As the political situation with the Tecran escalates, Sazo is pursuing goals of his own, and Rose has a firm opinion about where her loyalties should lie. The resolution is nothing ground-breaking as far as SF goes, but I thought it was satisfyingly tense and complex. Dark Horse leaves obvious room for a sequel, but it comes to a satisfying conclusion.
The writing is serviceable, particularly once you get into the story. I would not call it great, and it's not going to win any literary awards, but it didn't interfere with my enjoyment of the story.
This is not the sort of book that will make anyone's award list, but it is easily in the top five of books I had the most fun reading this year. Maybe save it for when you're looking for something light and wholesome and don't mind some rather obvious tropes, but if you're in the mood for imagining people who take laws seriously and sincerely try to help other people, I found this an utterly delightful way to pass the time. I immediately bought the sequel. Recommended.
Followed by Dark Deeds.
Rating: 8 out of 10
Armin Ronacher: MiniJinja: Learnings from Building a Template Engine in Rust
Given that I can't stop creating template engines, I figured I might write a bit about my learnings of creating MiniJinja which is an implementation of my Jinja2 template engine for Rust. Disclaimer: this post might be a bit more technical.
There is a good chance you have come across Jinja2 templates before as they became quite commonplace over the years. They look a bit like this:
    {% extends "layout.html" %}
    {% block body %}
      <p>Hello {{ name }}!</p>
    {% endblock %}
If you want to play around with it yourself, here are some links:
- The MiniJinja playground lets you play with a WASM compiled version of MiniJinja.
- The API Documentation documents all APIs, functionality and syntax.
- The GitHub Project for all the code including lots of examples.
- minijinja and minijinja-cli on crates.io
Maybe we start with the initial question of why I wrote MiniJinja. It's the year 2024 and people don't create a ton of HTML with server side rendered template engines any more. While there is some resurgence of that model thanks to HTMX, hotwire and livewire, I personally use SolidJS for my internal UI needs. There is however always a need to generate some form of text and so somehow the need for Jinja2 never really went away. When I originally created it, it was clearly meant for generating HTML with some JavaScript sprinkled on top, but in the years since I have encountered Jinja templates in many more places, primarily for generating YAML and similar formats. Lately it comes up for LLM prompt generation.
My personal need for MiniJinja came out of an experiment I built for infrastructure automation. Since the templates had to be loaded dynamically I could not use a system like Askama. Askama has type-safe templates that just generate Rust code. On the other hand most Jinja inspired template engines that are dynamic in Rust really do not try very hard to be Jinja compatible. Because writing template engines is also fun, I figured I might give it another try.
Over the last two years I kept adding to the engine until it got to the point where it's at almost feature parity with Jinja2 and quite enjoyable to use.
Runtime Values
When building a template engine for Rust you end up building a little dynamic programming language that is optimized for text generation. Consequently you pull in most of the challenges of building a dynamic language. Particularly when working in Rust the immediate challenge is memory management and exposing native Rust objects to the embedded language. So the interesting bit here is how to create a system that allows interactions between the template engine and the Rust world around it.
MiniJinja, unlike Jinja2, does not use code generation but has a basic stack-based VM and an AST-based bytecode compiler. Since MiniJinja follows Jinja2 it inherits a lot of the realities of the underlying object system that Jinja2 inherits from Python. For instance macros (functions) are first class objects and they can have closures. This has challenges because it's easy to create cycles and Rust has no garbage collector that can help with this problem.
The core object model in MiniJinja is a Value type which is represented by an enum that looks as follows (some less important variants removed):
    #[derive(Clone)]
    pub struct Value(ValueRepr);

    #[derive(Clone)]
    pub(crate) enum ValueRepr {
        Undefined,
        None,
        Bool(bool),
        U64(u64),
        I64(i64),
        F64(f64),
        String(Arc<str>, StringType),
        SmallStr(SmallStr),
        Invalid(Arc<Error>),
        Object(DynObject),
    }
Externally everything is a Value. If you Clone it, you usually bump a reference count or you make a cheap memcopy. Values are either primitives such as strings, numbers etc. or objects.
For objects MiniJinja provides a trait called Object which can be implemented by most Rust types. The engine provides a DynObject wrapper, which is a fancy Arc<dyn Object> that supports borrowing and object safety. I wrote about this before. What you will notice is that quite a few of the types involved have an Arc. That's because these values are for the most part reference counted. Since values here are really fat (they are 24 bytes in memory) a SmallStr type is used to hold up to 22 bytes of string data inline. One byte is used to encode the length of the string, and another byte is then used by the ValueRepr to mark which enum variant is in use. In pure theory this is all wrong. We never use weak references, so the weak count in the Arc is not used and clever bit hackery could be used to greatly reduce the size of the value type. I think one could get the whole thing down to 16 bytes trivially or even 8 bytes with NaN tagging. However I did not want to walk into the world of unsafe code more than feels appropriate.
MiniJinja is also plenty fast.
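To make the inline string idea concrete, here is a minimal sketch of a 22-byte small string next to a reference counted one. It is not MiniJinja's implementation: the `Repr` enum and the `try_new` and `as_str` helpers are hypothetical names chosen for illustration. On a typical 64-bit target an enum like this should come out at 24 bytes, but Rust does not guarantee enum layout, so treat that as an expectation rather than a promise.

```rust
use std::sync::Arc;

/// Hypothetical inline string: one length byte plus 22 bytes of data.
#[derive(Clone, Copy)]
pub struct SmallStr {
    len: u8,
    buf: [u8; 22],
}

impl SmallStr {
    /// Returns `None` when the string does not fit into the inline buffer.
    pub fn try_new(s: &str) -> Option<SmallStr> {
        if s.len() > 22 {
            return None;
        }
        let mut buf = [0u8; 22];
        buf[..s.len()].copy_from_slice(s.as_bytes());
        Some(SmallStr { len: s.len() as u8, buf })
    }

    pub fn as_str(&self) -> &str {
        // The bytes were copied from a complete `&str`, so they are valid UTF-8.
        std::str::from_utf8(&self.buf[..self.len as usize]).unwrap()
    }
}

/// Simplified stand-in for a value representation (assumption, not the real enum).
#[derive(Clone)]
pub enum Repr {
    U64(u64),
    String(Arc<str>),
    SmallStr(SmallStr),
}

impl From<&str> for Repr {
    fn from(s: &str) -> Repr {
        // Short strings are stored inline, longer ones are reference counted.
        match SmallStr::try_new(s) {
            Some(small) => Repr::SmallStr(small),
            None => Repr::String(Arc::from(s)),
        }
    }
}

impl Repr {
    pub fn as_str(&self) -> Option<&str> {
        match self {
            Repr::String(s) => Some(&**s),
            Repr::SmallStr(s) => Some(s.as_str()),
            Repr::U64(_) => None,
        }
    }
}
```

Cloning either string variant stays cheap: the inline variant is a plain copy and the `Arc<str>` variant only bumps a reference count.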
One variant that is worth calling out is Invalid. That's a value that can exist in the system but it carries an error. When you're trying to interact with it in most cases it will propagate this error. That's used in the engine in places where the API assumes infallibility (particularly during iteration) but it needs a way to emit an error. This concept is quite common when writing an engine in C though typically the actual error is carried out of band. For instance in QuickJS there is a marker value that indicates a failure, but the actual error is held on the interpreter runtime.
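The idea of carrying an error inside a value can be sketched roughly like this. It is an illustration under assumptions, not the engine's code: the `Value` and `Error` types and the `parse_all`, `validate` and `sum` helpers are made up for the example.

```rust
use std::sync::Arc;

#[derive(Debug)]
pub struct Error(pub String);

/// Hypothetical, heavily reduced value type.
#[derive(Clone, Debug)]
pub enum Value {
    I64(i64),
    /// Carries an error through APIs that cannot return `Result`.
    Invalid(Arc<Error>),
}

impl Value {
    /// Surfaces a carried error once the caller is in a position to fail.
    pub fn validate(self) -> Result<Value, Arc<Error>> {
        match self {
            Value::Invalid(err) => Err(err),
            other => Ok(other),
        }
    }
}

/// Parses numbers lazily. `Iterator::next` has no way to fail, so a parse
/// error is emitted as an `Invalid` value instead.
pub fn parse_all<'a>(items: &'a [&'a str]) -> impl Iterator<Item = Value> + 'a {
    items.iter().map(|s| match s.parse::<i64>() {
        Ok(n) => Value::I64(n),
        Err(e) => Value::Invalid(Arc::new(Error(format!("bad number {s:?}: {e}")))),
    })
}

/// Sums the values and propagates the first carried error.
pub fn sum(iter: impl Iterator<Item = Value>) -> Result<i64, Arc<Error>> {
    let mut total = 0;
    for value in iter {
        match value.validate()? {
            Value::I64(n) => total += n,
            Value::Invalid(_) => unreachable!("validate rejects invalid values"),
        }
    }
    Ok(total)
}
```

The iterator itself stays infallible, and the error only becomes a `Result` at the point where the consumer inspects the value.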
The trait definition for objects looks like this:
    pub trait Object: Debug + Send + Sync {
        fn repr(self: &Arc<Self>) -> ObjectRepr { ... }
        fn get_value(self: &Arc<Self>, key: &Value) -> Option<Value> { ... }
        fn enumerate(self: &Arc<Self>) -> Enumerator { ... }
        fn enumerator_len(self: &Arc<Self>) -> Option<usize> { ... }
        fn is_true(self: &Arc<Self>) -> bool { ... }
        fn call(
            self: &Arc<Self>,
            state: &State<'_, '_>,
            args: &[Value],
        ) -> Result<Value, Error> { ... }
        fn call_method(
            self: &Arc<Self>,
            state: &State<'_, '_>,
            method: &str,
            args: &[Value],
        ) -> Result<Value, Error> { ... }
        fn render(self: &Arc<Self>, f: &mut Formatter<'_>) -> Result
        where
            Self: Sized + 'static { ... }
    }
Some of these methods are implemented automatically. For instance many of the methods such as is_true or enumerator_len have a default implementation that is based on object repr and the return value from enumerate. But they can be overridden to change the default behavior or to add some potential optimizations.
One of the most important types in Jinja is a map as it holds the template context. Maps are implemented, as you can imagine, as an Object. The implementation is in fact pretty trivial:
    impl<V> Object for BTreeMap<Value, V>
    where
        V: Into<Value> + Clone + Send + Sync + fmt::Debug + 'static,
    {
        fn get_value(self: &Arc<Self>, key: &Value) -> Option<Value> {
            self.get(key).cloned().map(|v| v.into())
        }

        fn enumerate(self: &Arc<Self>) -> Enumerator {
            self.mapped_enumerator(|this| Box::new(this.keys().cloned()))
        }
    }
This reveals two interesting aspects of the object model: First that Value implements Hash. That means any value can be used as a key in a map. While this is atypical for Rust and not even what happens in Python, it simplifies the system greatly. When in the template engine you write {{ object.key }}, behind the scenes object.get_value(Value::from("key")) is called. Since most keys are typically less than 22 characters, creating a dummy Value wrapper around it is not too problematic.
The second and probably more interesting part here is that you can sort of borrow out of an object for the enumerator. The mapped_enumerator helper takes a reference to self and invokes a closure which itself can borrow from self. This adjacent borrowing is implemented with unsafe code as there is no other way to make it work. The combination of repr (defaults to Map), get_value and enumerate gives the object the behavior, shape and contents.
Vectors look quite similar:
    impl<T> Object for Vec<T>
    where
        T: Into<Value> + Clone + Send + Sync + fmt::Debug + 'static,
    {
        fn repr(self: &Arc<Self>) -> ObjectRepr {
            ObjectRepr::Seq
        }

        fn get_value(self: &Arc<Self>, key: &Value) -> Option<Value> {
            self.get(key.as_usize()?).cloned().map(|v| v.into())
        }

        fn enumerate(self: &Arc<Self>) -> Enumerator {
            Enumerator::Seq(self.len())
        }
    }
Enumerators and Object Behaviors
Enumeration in MiniJinja is a way to allow an object to describe what's inside of it. In combination with the return values from repr() the engine changes how iteration is performed. These are possible enumerators:
    pub enum Enumerator {
        NonEnumerable,
        Empty,
        Iter(Box<dyn Iterator<Item = Value> + Send + Sync>),
        Seq(usize),
        Values(Vec<Value>),
    }
It's probably easier to explain how enumerators turn into iterators by showing you the try_iter method in the engine:
    impl DynObject {
        fn try_iter(self: &Self) -> Option<Box<dyn Iterator<Item = Value> + Send + Sync>>
        where
            Self: 'static,
        {
            match self.enumerate() {
                Enumerator::NonEnumerable => None,
                Enumerator::Empty => Some(Box::new(None::<Value>.into_iter())),
                Enumerator::Seq(l) => {
                    let self_clone = self.clone();
                    Some(Box::new((0..l).map(move |idx| {
                        self_clone.get_value(&Value::from(idx)).unwrap_or_default()
                    })))
                }
                Enumerator::Iter(iter) => Some(iter),
                Enumerator::Values(v) => Some(Box::new(v.into_iter())),
            }
        }
    }
Some of the trivial enumerators are quick to explain: Enumerator::NonEnumerable just does not support iteration and Enumerator::Empty does but won't yield any values. The more interesting one is Enumerator::Seq(n) which basically tells the engine to call get_value from 0 to n to yield items from the object. This is how sequences are implemented. The rest are enumerators that just directly yield values.
So when you want to iterate over a map, you will usually use something like Enumerator::Iter and iterate over all the keys in the map.
The engine then uses ObjectRepr to figure out what to do with it. A value marked as ObjectRepr::Seq displays like a sequence, can be indexed with integers, and iterates over the values in the sequence. If the repr is ObjectRepr::Map then the expectation is that it will be indexable by key and it will iterate over the keys when used in a loop. Its default rendering also is a key-value pair list wrapped in curly braces.
Now quite frankly I don't like that iteration protocol. I think it's more sensible for maps to naturally iterate over the key-value pairs, but since MiniJinja follows Jinja2 and Jinja2 follows Python, emulating that behavior was important.
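The way a Seq enumerator turns into an iterator can be reduced to a small self-contained sketch. The trait below is a deliberately simplified stand-in, not MiniJinja's Object trait: values are plain `i64`, keys are `usize`, and `try_iter` is a free function rather than a method on DynObject.

```rust
use std::sync::Arc;

/// Stand-in for the engine's value type (assumption for this sketch).
pub type Value = i64;

pub enum Enumerator {
    NonEnumerable,
    /// "Call get_value for every index from 0 up to (not including) this length."
    Seq(usize),
    /// An iterator that directly yields values.
    Iter(Box<dyn Iterator<Item = Value> + Send + Sync>),
}

pub trait Object: Send + Sync + 'static {
    fn get_value(&self, key: usize) -> Option<Value>;
    fn enumerate(&self) -> Enumerator;
}

impl Object for Vec<Value> {
    fn get_value(&self, key: usize) -> Option<Value> {
        self.get(key).copied()
    }

    fn enumerate(&self) -> Enumerator {
        // The enumerator only describes the shape; it holds no data.
        Enumerator::Seq(self.len())
    }
}

pub fn try_iter(obj: Arc<dyn Object>) -> Option<Box<dyn Iterator<Item = Value> + Send + Sync>> {
    match obj.enumerate() {
        Enumerator::NonEnumerable => None,
        // The object itself is moved into the iterator and indexed lazily.
        Enumerator::Seq(len) => Some(Box::new(
            (0..len).map(move |idx| obj.get_value(idx).unwrap_or_default()),
        )),
        Enumerator::Iter(iter) => Some(iter),
    }
}
```

Because the Seq variant carries only a length, the same object can be enumerated again later without having cloned its contents.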
Enumerators are a bit different than iterators because they might only define how iteration is performed (see: Enumerator::Seq). To actually create an iterator, the object is then passed to it. They are also asked to provide a length. When an enumerator provides a length it's an indication to the engine that the object can be iterated over more than once (you can re-create the enumerator). This is why an object can end up in a MiniJinja template looking like a list when it is actually just an iterable object with a known length. For this MiniJinja uses a trick where it will inspect the size hint of the iterator to make assumptions about it. Internally every enumerator allows the engine to query the length of it:
    impl Enumerator {
        fn query_len(&self) -> Option<usize> {
            Some(match self {
                Enumerator::Empty => 0,
                Enumerator::Values(v) => v.len(),
                Enumerator::Iter(i) => match i.size_hint() {
                    (a, Some(b)) if a == b => a,
                    _ => return None,
                },
                Enumerator::RevIter(i) => match i.size_hint() {
                    (a, Some(b)) if a == b => a,
                    _ => return None,
                },
                Enumerator::Seq(v) => *v,
                Enumerator::NonEnumerable => return None,
            })
        }
    }
The important part here is the call to size_hint. If the upper bound is known, and the lower bound matches the upper bound, then MiniJinja will assume the iterator will always have that length (for as long as it has not been iterated). As a result it will change the way the object is interacted with. This for instance means that if you run range(10) in a template it looks like a list when printed even though iteration and number creation is lazy. On the other hand if you use the Value::make_one_shot_iterator API the length hint will always be disabled and MiniJinja will not attempt to interact with the iterator when printing it:
    {{ range(4) }}         -> prints [0, 1, 2, 3]
    {{ a_real_iterator }}  -> prints <iterator>
Building a VM
Lexing and parsing I think is not too puzzling in Rust, but making an AST and making a VM is kinda unusual. The first thing is that Rust is just not particularly amazing at tree structures. In MiniJinja I really wanted to avoid having the AST at all, but it does come in handy to implement some of the functionality that Jinja2 requires. For instance to establish closures it will just walk the AST to figure out which names are looked up within a function. I tried a few things to improve how memory allocations work with the AST. There are great crates out there for doing this, but I really wanted MiniJinja to be light on dependencies so I ended up opting against all of them.
For the AST design I went with large enums that hold Spanned<T> values:
    pub enum Expr<'a> {
        Var(Spanned<Var<'a>>),
        Const(Spanned<Const>),
        ...
    }

    pub struct Var<'a> {
        pub id: &'a str,
    }

    pub struct Const {
        pub value: Value,
    }
You might now be curious what Spanned<T> is. It's a wrapper type that does two things: it boxes the inner node and it stores an adjacent Span which is basically the code location in the original input template for debugging:
    pub struct Spanned<T> {
        node: Box<T>,
        span: Span,
    }
It implements Deref like a smart pointer so you can poke right through it to interact with the node. The code generator just walks the AST and emits instructions for it.
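For readers who have not written a Deref wrapper before, this is roughly what the pattern looks like. It is a sketch under assumptions: the fields of `Span`, the `new` and `span` helpers, and the `describe` function are invented here and may not match the engine.

```rust
use std::ops::Deref;

/// Hypothetical source location; the real Span may carry different fields.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub start_line: u32,
    pub start_col: u32,
}

pub struct Spanned<T> {
    node: Box<T>,
    span: Span,
}

impl<T> Spanned<T> {
    pub fn new(node: T, span: Span) -> Spanned<T> {
        Spanned { node: Box::new(node), span }
    }

    pub fn span(&self) -> Span {
        self.span
    }
}

impl<T> Deref for Spanned<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.node
    }
}

pub struct Var<'a> {
    pub id: &'a str,
}

pub enum Expr<'a> {
    Var(Spanned<Var<'a>>),
}

pub fn describe(expr: &Expr<'_>) -> String {
    match expr {
        // `var.id` reaches through `Spanned` to the boxed `Var` via `Deref`,
        // while `var.span()` is a method of the wrapper itself.
        Expr::Var(var) => format!("{} at line {}", var.id, var.span().start_line),
    }
}
```

Boxing every node keeps the enum variants pointer-sized plus the span, so a large node type does not inflate the size of every expression.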
The instructions themselves are a large enum but the number of arguments to the variants is kept rather low to not waste too much memory. The base size of the instruction is dominated by it being able to hold a Value which as we have established is a pretty hefty thing:
    pub enum Instruction<'source> {
        EmitRaw(&'source str),
        StoreLocal(&'source str),
        Lookup(&'source str),
        LoadConst(Value),
        Jump(usize),
        JumpIfFalse(usize),
        JumpIfFalseOrPop(usize),
        JumpIfTrueOrPop(usize),
        ...
    }
The VM keeps most of the runtime state on a State object that is passed to a few places. For instance you have already seen this in the call signature further up. The state for instance holds the loaded instructions or the template context. The VM itself maintains a stack of values and then just steps through a list of instructions on the state in a loop. Since there are a lot of instructions you can have a look on GitHub to see it in its entirety. Here however is a small part that shows roughly how this works:
    let mut pc = 0;
    loop {
        let instr = match state.instructions.get(pc) {
            Some(instr) => instr,
            None => break,
        };
        let a;
        let b;
        match instr {
            Instruction::EmitRaw(val) => {
                out.write_str(val).map_err(Error::from)?;
            }
            Instruction::Emit => {
                self.env.format(&stack.pop(), state, out)?;
            }
            Instruction::StoreLocal(name) => {
                state.ctx.store(name, stack.pop());
            }
            Instruction::Lookup(name) => {
                stack.push(assert_valid!(state
                    .lookup(name)
                    .unwrap_or(Value::UNDEFINED)));
            }
            Instruction::GetAttr(name) => {
                a = stack.pop();
                stack.push(match a.get_attr_fast(name) {
                    Some(value) => value,
                    None => undefined_behavior.handle_undefined(a.is_undefined())?,
                });
            }
            Instruction::LoadConst(value) => {
                stack.push(value.clone());
            }
            Instruction::Jump(jump_target) => {
                pc = *jump_target;
                continue;
            }
            Instruction::JumpIfFalse(jump_target) => {
                a = stack.pop();
                if !undefined_behavior.is_true(&a)? {
                    pc = *jump_target;
                    continue;
                }
            }
            // ...
        }
        pc += 1;
    }
Basically the index of the current instruction is held in pc (short for program counter). Normally it's advanced by one, but jump instructions can change the pc to any other location. If you run out of instructions the evaluation ends.
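The excerpt above depends on engine internals, so here is a tiny standalone stack VM in the same spirit that can actually be run. It is a sketch with invented names (`eval`, `program`) and a deliberately crude notion of truthiness where only `Bool(false)` counts as false.

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Bool(bool),
    Str(String),
}

pub enum Instruction<'source> {
    EmitRaw(&'source str),
    LoadConst(Value),
    Emit,
    Jump(usize),
    JumpIfFalse(usize),
}

pub fn eval(instructions: &[Instruction<'_>]) -> String {
    let mut out = String::new();
    let mut stack: Vec<Value> = Vec::new();
    let mut pc = 0;
    // Running past the last instruction ends the evaluation.
    while let Some(instr) = instructions.get(pc) {
        match instr {
            Instruction::EmitRaw(text) => out.push_str(text),
            Instruction::LoadConst(value) => stack.push(value.clone()),
            Instruction::Emit => match stack.pop() {
                Some(Value::Str(s)) => out.push_str(&s),
                Some(Value::Bool(b)) => out.push_str(if b { "true" } else { "false" }),
                None => {}
            },
            Instruction::Jump(target) => {
                pc = *target;
                continue;
            }
            Instruction::JumpIfFalse(target) => {
                // Simplification: only an explicit `false` takes the jump.
                if stack.pop() == Some(Value::Bool(false)) {
                    pc = *target;
                    continue;
                }
            }
        }
        pc += 1;
    }
    out
}

/// Hand-compiled form of `{% if flag %}yes{% else %}no{% endif %}!`.
pub fn program(flag: bool) -> Vec<Instruction<'static>> {
    vec![
        Instruction::LoadConst(Value::Bool(flag)),
        Instruction::JumpIfFalse(4),
        Instruction::EmitRaw("yes"),
        Instruction::Jump(5),
        Instruction::EmitRaw("no"),
        Instruction::EmitRaw("!"),
    ]
}
```

Jump targets are plain indexes into the instruction list, which is why the jump arms set pc and `continue` instead of falling through to the increment.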
One piece of complexity in the VM comes down to macros. That's because lifetimes make that really tricky. A macro is just a Value that holds a Macro Object internally. So how can that macro reference the instructions, if the instructions themselves have a lifetime to the template 'source? The answer is that they can't (at least I have not found a reasonable way). So instead a macro has an ID which acts as a handle to look up the instructions dynamically from the execution state. Additionally each state has a unique ID so the engine can assert that nothing funny was happening. The downside of this is that a macro cannot be "returned" from a template. They can however be imported from one template into another.
Here is what a macro object looks like in code (abbreviated):
    pub(crate) struct Macro {
        pub name: Value,
        pub arg_spec: Vec<Value>,
        pub macro_ref_id: usize, // id of the macro
        pub state_id: isize,
        pub closure: Value,
        pub caller_reference: bool,
    }

    impl Object for Macro {
        fn call(self: &Arc<Self>, state: &State<'_, '_>, args: &[Value]) -> Result<Value, Error> {
            // we can only call macros that point to loaded template state.
            // if a macro were returned from a template this will fail.
            if state.id != self.state_id {
                return Err(Error::new(
                    ErrorKind::InvalidOperation,
                    "cannot call this macro. template state went away.",
                ));
            }

            // ... argument parsing
            let arg_values = ...;

            // find referenced instructions
            let (instructions, offset) = &state.macros[self.macro_ref_id];

            // create a nested vm and evaluate the macro
            let vm = Vm::new(state.env());
            let mut rv = String::new();
            let mut out = Output::with_string(&mut rv);
            let closure = self.closure.clone();
            ok!(vm.eval_macro(
                instructions,
                *offset,
                self.closure.clone(),
                state.ctx.clone_base(),
                caller,
                &mut out,
                state,
                arg_values
            ));

            // return rendered template as string from the call
            Ok(if !matches!(state.auto_escape(), AutoEscape::None) {
                Value::from_safe_string(rv)
            } else {
                Value::from(rv)
            })
        }
    }
Additionally the closure is a good source of cycles. For that reason the engine keeps track of all closures during the execution and breaks cycles caused by closures manually by clearing them out.
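Breaking reference cycles by hand can be sketched in a few lines. Everything below is hypothetical and only mirrors the idea described above: `Closure`, `Captured`, `ClosureTracker` and `make_cycle` are illustrative names, and the real engine may track and clear closures differently.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};

/// Stand-in for a closure object: a shared, mutable map of captured values.
#[derive(Default)]
pub struct Closure {
    values: Mutex<HashMap<String, Captured>>,
}

/// A captured value. A macro holds on to the closure it was defined in.
pub enum Captured {
    Macro { closure: Arc<Closure> },
}

impl Closure {
    pub fn store(&self, name: &str, value: Captured) {
        self.values.lock().unwrap().insert(name.to_string(), value);
    }

    pub fn clear(&self) {
        self.values.lock().unwrap().clear();
    }
}

/// Remembers every closure created during a render.
#[derive(Default)]
pub struct ClosureTracker {
    closures: Mutex<Vec<Arc<Closure>>>,
}

impl ClosureTracker {
    pub fn new_closure(&self) -> Arc<Closure> {
        let closure = Arc::new(Closure::default());
        self.closures.lock().unwrap().push(closure.clone());
        closure
    }
}

impl Drop for ClosureTracker {
    fn drop(&mut self) {
        // Emptying each closure drops the references that form the cycles,
        // so plain reference counting can free everything afterwards.
        for closure in self.closures.lock().unwrap().iter() {
            closure.clear();
        }
    }
}

/// Builds a closure containing a macro that points back at that same closure.
pub fn make_cycle(tracker: &ClosureTracker) -> Weak<Closure> {
    let closure = tracker.new_closure();
    closure.store("recursive_macro", Captured::Macro { closure: closure.clone() });
    Arc::downgrade(&closure)
}
```

Without the tracker the closure would keep itself alive through the macro it contains. With it, dropping the tracker at the end of the render releases the memory.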
Cool APIs
The last part that I want to go over is the magic that makes this work:
    fn slugify(value: String) -> String {
        value.to_lowercase().split_whitespace().collect::<Vec<_>>().join("-")
    }

    fn timeformat(state: &State, ts: f64) -> String {
        let configured_format = state.lookup("TIME_FORMAT");
        let format = configured_format
            .as_ref()
            .and_then(|x| x.as_str())
            .unwrap_or("HH:MM:SS");
        format_unix_timestamp(ts, format)
    }

    let mut env = Environment::new();
    env.add_filter("slugify", slugify);
    env.add_filter("timeformat", timeformat);
You might have seen something like this in Rust before, but it's still a bit magical. How can you make functions with seemingly different signatures register with the add_filter function? How does the engine perform the type conversions (as we know the engine has Value types, so where does the String conversion take place?). This is a topic for a blog post on its own but the answer behind this lies in a lot of clever trait hackery. The add_filter function reveals a bit of that hackery:
    pub fn add_filter<N, F, Rv, Args>(&mut self, name: N, f: F)
    where
        N: Into<Cow<'source, str>>,
        F: Filter<Rv, Args> + for<'a> Filter<Rv, <Args as FunctionArgs<'a>>::Output>,
        Rv: FunctionResult,
        Args: for<'a> FunctionArgs<'a>,
    {
        let filter = BoxedFilter(Arc::new(move |state, args| -> Result<Value, Error> {
            f.apply_to(Args::from_values(Some(state), args)?).into_result()
        }));
        self.filters.insert(name.into(), filter);
    }
Hidden behind this rather complex set of traits are some basic ideas:
- FunctionArgs is a helper trait for type conversions. It's implemented for tuples of different sizes made of ArgType values. These tuples represent the signature of the function. It has a method called from_values which performs that conversion via ArgType.
- ArgType, which you can't really see in the code above, is a trait that knows how to convert a Value into whatever the function desires as argument.
- Filter is a trait implemented for functions with qualifying FunctionArgs signatures returning a FunctionResult.
- A FunctionResult is a trait that represents potential return values from the function such as a Value, something that can be converted into a Value or a Result.
- The BoxedFilter type is what converts the passed closure into a reference counted object that is held in the environment.
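The interplay of these traits is easier to follow in a stripped-down version. The sketch below borrows the trait names from the list above but is otherwise an assumption-laden simplification: there is no State argument, no lifetimes on the argument types, `Into<Value>` stands in for FunctionResult, errors are plain strings, and `apply_filter` and `join_words` are invented for the example.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Simplified, hypothetical stand-ins for the engine's value and error types.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    I64(i64),
    Str(String),
}
pub type Error = String;

impl From<String> for Value {
    fn from(s: String) -> Value {
        Value::Str(s)
    }
}

/// Converts one (possibly missing) `Value` into a concrete argument type.
pub trait ArgType: Sized {
    fn from_value(value: Option<&Value>) -> Result<Self, Error>;
}
impl ArgType for String {
    fn from_value(value: Option<&Value>) -> Result<Self, Error> {
        match value {
            Some(Value::Str(s)) => Ok(s.clone()),
            Some(Value::I64(n)) => Ok(n.to_string()),
            None => Err("missing argument".to_string()),
        }
    }
}

/// Converts a whole argument slice into a tuple matching a function signature.
pub trait FunctionArgs: Sized {
    fn from_values(values: &[Value]) -> Result<Self, Error>;
}
impl<A: ArgType> FunctionArgs for (A,) {
    fn from_values(values: &[Value]) -> Result<Self, Error> {
        Ok((A::from_value(values.first())?,))
    }
}
impl<A: ArgType, B: ArgType> FunctionArgs for (A, B) {
    fn from_values(values: &[Value]) -> Result<Self, Error> {
        Ok((A::from_value(values.first())?, B::from_value(values.get(1))?))
    }
}

/// Implemented for plain functions of each supported arity.
pub trait Filter<Rv, Args>: Send + Sync + 'static {
    fn apply_to(&self, args: Args) -> Rv;
}
impl<Func, Rv, A> Filter<Rv, (A,)> for Func
where
    Func: Fn(A) -> Rv + Send + Sync + 'static,
{
    fn apply_to(&self, (a,): (A,)) -> Rv {
        (self)(a)
    }
}
impl<Func, Rv, A, B> Filter<Rv, (A, B)> for Func
where
    Func: Fn(A, B) -> Rv + Send + Sync + 'static,
{
    fn apply_to(&self, (a, b): (A, B)) -> Rv {
        (self)(a, b)
    }
}

type BoxedFilter = Arc<dyn Fn(&[Value]) -> Result<Value, Error> + Send + Sync>;

#[derive(Default)]
pub struct Environment {
    filters: HashMap<String, BoxedFilter>,
}
impl Environment {
    pub fn add_filter<F, Rv, Args>(&mut self, name: &str, f: F)
    where
        F: Filter<Rv, Args>,
        Rv: Into<Value> + 'static,
        Args: FunctionArgs + 'static,
    {
        // Type erasure happens here: the closure remembers `Args` and `Rv`.
        let boxed: BoxedFilter = Arc::new(move |args: &[Value]| -> Result<Value, Error> {
            Ok(f.apply_to(Args::from_values(args)?).into())
        });
        self.filters.insert(name.to_string(), boxed);
    }
    pub fn apply_filter(&self, name: &str, args: &[Value]) -> Result<Value, Error> {
        let filter = self.filters.get(name).ok_or_else(|| format!("unknown filter {name}"))?;
        filter(args)
    }
}

pub fn slugify(value: String) -> String {
    value.to_lowercase().split_whitespace().collect::<Vec<_>>().join("-")
}
pub fn join_words(value: String, sep: String) -> String {
    value.split_whitespace().collect::<Vec<_>>().join(&sep)
}
```

Both `env.add_filter("slugify", slugify)` and `env.add_filter("join_words", join_words)` compile against the same method. The compiler picks the Filter implementation whose tuple arity matches the function, and that choice also fixes which FunctionArgs conversion runs at call time.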
I think a lot of the patterns in MiniJinja are useful for projects outside of MiniJinja. There is quite a bit more hidden in it than what I have covered here, and some of it I have talked about before, such as how MiniJinja is abusing serde. If you have a need for a Jinja2 compatible template engine I would love it if you got some use out of it. If you're curious about how to build a runtime and object system in Rust, you might also find some utility in the codebase.
I myself learned quite a bit about what creative API design can look like in Rust by building it. At this point I am incredibly happy with how the public API of the engine turned out. The engine is extensively documented both internally and publicly and you can read all about it in the API docs.
GSoC '24 Update: Porting Arianna to Foliate-js
As my Google Summer of Code 2024 journey concludes, I'm excited to share the updates on my project: Porting Arianna to Foliate-js. The main goal was to replace the outdated epub.js with the actively maintained Foliate-js. In my previous blog post, I discussed the initial progress on integrating Foliate-js into Arianna, including the implementation of Table of Contents (TOC) and metadata handling.
My work done so far
Overcoming Rendering Challenges
- Rendering Issues: One of the major hurdles was fixing the rendering issues that were causing the book to not be visible on the screen. This was a complex problem, but with the guidance of my mentor, we were able to resolve it successfully, and the book is now visible on the screen.
- Text Color in Light Theme: I also addressed the text color issues in the light theme mode, ensuring the text stays visible and visually consistent across different themes.
- Navigation Buttons: Enabled the navigation buttons by setting backend.locationsReady to true when the book is ready. This was a key fix to enhance user navigation within the ebook, letting users move from one page to another by means other than the arrow keys.
- Theme Color Handling: Lastly, I worked on the handling of theme colors to provide a consistent visual experience across different themes.
- Slider and Progress Percentage: I fixed the slider functionality, making sure it accurately reflects the reading progress. This update ensures that users can track their progress through the ebook with precision.
- Reading Position Accuracy: I also ensured that the slider accurately reflects the reading position when users interact with it, improving the overall usability.
- Book Progress Display: I resolved issues with the book progress display, updated the time left calculation, and fixed the popup behavior, refining the user interface for a smoother reading experience.
This project has been a significant learning experience. The most challenging part for me was making things work and realizing that not everything is as straightforward as I initially thought. It was daunting to dive into a large codebase and try to understand how everything fits together, but this experience taught me the importance of patience and how to break problems down so they can be solved simply.
Looking Ahead
While significant progress has been made, many things are left to do:
- Fixing right-click copy and search functionality
- Implementing Ctrl+ shortcut for increasing font size
- Addressing link color and redirect issues
- Features like bookmarking and annotations
While my GSoC journey is coming to an end, my contributions to Arianna and the open-source community will continue.
I'd like to thank my mentor, Carl Schwan, for the guidance and support throughout the project, the KDE community, and the Google Summer of Code program for this opportunity. This experience has not only improved Arianna but has also been a transformative journey for me as a developer.
Thank you for following along with my progress. See you in my next blog post with more updates.
Open Source AI – Weekly update August 26
As we move toward the release of the first-ever Open Source AI Definition in October at All Things Open, the publication of the 0.0.9 draft brings us one step closer to realizing this goal.
- OSAID 0.0.9 draft definition is live!
- Changelog includes:
- New Feature: Clarified Open Source Models and Weights
- Added a new paragraph under “What is Open Source AI” to define “system” as including both models and weights.
- Clarified that all components of a larger system must meet the standard.
- Updated paragraph after the “share” bullet to emphasize this point.
- New Section: Open Source Models and Open Source Weights
- Added descriptions of components for both models and weights in machine learning systems.
- Edited subsequent paragraphs to eliminate redundancy.
- Training Data: Defined as a Benefit, Not a Requirement
- Defined open, public, and unshareable non-public training data.
- Explained the role of training data in studying AI systems and understanding biases.
- Emphasized extra requirements for data to advance openness, especially in private-first areas like healthcare.
- Separation of Checklist
- The Checklist is now a separate document from the main Definition.
- Fully aligned Checklist content with the Model Openness Framework (MOF).
- Terminology Changes
- Replaced “Model” with “Weights” under “Preferred form to make modifications” for consistency.
- Explicit Reference to Recipients of the Four Freedoms
- Added specific references to developers, deployers, and end users of AI systems.
- Credits and References
- Incorporated credit to the Free Software Definition.
- Added references to conditions of availability of components, referencing the Open Source Definition.
- Initial reactions on the forum:
- @shujisado praises the updates in version 0.0.9, particularly the decision to separate the checklist from the main document, which clarifies the intent behind OSAID. He also supports the separation of “code” and “weights,” noting that in Japan, “code” clearly falls under copyright, making this distinction logical. He acknowledges revisions in the checklist that consider the importance of complete datasets, even though he disagrees with making datasets mandatory.
- Comments on the draft on HackMD
- @Joshua Gay adds that instead of narrowing the focus to machine-learning systems, the emphasis should be on “parameters” as a whole since weights are just one type of parameter. He suggests a rewrite that highlights making model parameters, such as weights and other settings, available under OSI-approved terms, with examples across various AI models.
- He further suggests using broader language that covers more AI systems instead of narrower terminology. Specifically, he proposes replacing “Open Source models and Open Source weights” with “Open Source models and Open Source parameters,” and using “AI systems” instead of “machine learning systems.” Additionally, he recommends redefining an AI model to include architecture, parameters like weights and decision boundaries, and inference code, while referring to AI parameters as configuration settings that produce outputs from inputs.
- Under “Open Source models and Open Source weights”, @shujisado adds that the last paragraph titled “Open Source models and Open Source weights” actually explains “AI model” and “AI weights,” leading to a mismatch between the title and content, and notes that these terms are not used elsewhere in the definition.
- Under “Preferred form to make modifications to machine-learning systems”, @shujisado suggests some grammatical corrections.
- Next steps
- The OSI has recently presented at the following events:
- Hong Kong for AI_dev, August 21-23
- Beijing for Open Source Congress, August 25-27.
- Iterate Drafts: Continue refining drafts with feedback from the worldwide roadshow, considering new dissenting opinions.
- Review Licenses: Decide on the best approach for reviewing new licenses for datasets, documentation, and model parameters.
- Enhance FAQ: Continue improving the FAQ to address emerging questions.
- Post-Stable Release Plan: Establish a process for reviewing and updating future versions of the Open Source AI Definition.
- Get involved:
- Join the forum and share your opinion.
- Leave a comment on the draft v.0.0.9 with precise feedback.
- Follow the weekly recaps and subscribe to our monthly newsletter.
- Join the town hall meetings: we’re increasing the frequency to weekly meetings where you can learn more, ask questions, and share your thoughts. The next is on September 6.
- Join the workshops and scheduled conferences
- @Kjetilk points out the legal distinction between using copyrighted works for AI training (reproduction) and incorporating them into publishable datasets, questioning the fairness of allowing exploitative models without compensation while potentially banning those that benefit society.
- @Shujisado clarifies that compensation for copyrighted works used in AI training is possible for both open source and closed models, distinguishing it from “royalty,” and notes that Japan’s copyright law exempts such uses for machine learning.
- @Kjetilk reiterates the relevance of “royalty” for compensation in closed, non-published models, suggesting it makes sense under copyright law if required, but if not, it could benefit science and the arts.
Kate & Fonts
With the Qt 6.7 release, Qt introduced a wide range of improvements to text rendering and font shaping.
One element of this is that you can now configure OpenType font features.
Many of the 'new cool' programming fonts have such features integrated. That includes both free fonts like Cascadia Code or paid fonts like MonoLisa.
Let's use the features of Cascadia Code as an example; this is what they promote on their GitHub page:
For example, if you set the ss01 feature, you get some alternative italics. The same holds for MonoLisa, where the feature is ss02. That already shows that these features are often not very usefully named and are very font-specific.
Thanks to Waqar and me, with the upcoming KF 6.6 release, one will be able to configure that in Kate and other KTextEditor based applications.
The generic KDE Frameworks font chooser now allows configuring these features, and KTextEditor will keep these settings around.
Here are the alternative italics enabled in Kate, with the enhanced font chooser still open (look at the SPDX markers in the code):
A remaining issue is how best to handle saving the configuration in a more generic way. Ideas on how to add that to KConfig without breaking compatibility of the configuration files we write with older applications would be welcome. For KTextEditor, we just add an extra key for the features, which old versions will ignore.
Talking Drupal: Talking Drupal #464 - Drupal Content Production
Today we are talking about Producing content with Drupal, How Drupal can help content producers, and ways it could be better with guest Jerry Ta. We’ll also cover Stage File Proxy as our module of the week.
For show notes visit: www.talkingDrupal.com/464
Topics
- Brief overview of Urban Institute using Drupal
- What are the day to day responsibilities of a content producer
- Layout Builder or Paragraphs
- What is your opinion
- You've been in content production for almost 2 decades, what was your first website editing tool.
- How long have you been using Drupal
- What is your number one wish the Drupal community would solve
- Drupalcon
- What value do you look for for a content producer
- What is the hardest part of using Drupal
- Starshot reaction
- Predictions for Drupal in 5 years for content producers
- Modules for replacing files on Drupal - , Media Entity File Replace, etc.
- Content Sync
- Tokens with CKEditor module
- Shortcode
- Common Spot
- Scheduled transitions
- Experience builder
- Starshot
Jerry Ta - joshmiller
Hosts
Nic Laflin - nLighteneddevelopment.com nicxvan
John Picozzi - epam.com johnpicozzi
Josh Miller - joshmiller
MOTW Correspondent
Martin Anderson-Clutz - mandclu.com mandclu
- Brief description:
- Have you ever wanted to work on code or configuration changes to your Drupal site in a non-production environment, without having to copy over all the images and other content files? There’s a module for that.
- Module name/project name:
- Stage File Proxy
- Brief history
- How old: created in Jan 2011 by netaustin, but recent releases are by Stephen Mustgrave, who listeners will probably recognize from the Needs Review initiative, among his many other Drupal contributions
- Versions available: 7.x-1.10, 3.0.0-alpha2, and 3.1.0, the last of which works with Drupal 10.3 and 11
- Maintainership
- Actively maintained
- Security coverage
- Test coverage
- Documentation - not a lot, but it has been the subject of numerous blog posts over the years
- Number of open issues: 15 open issues, 2 of which are bugs against the current branch
- Usage stats:
- 16,710 sites
- Module features and usage
- Once you have Stage File Proxy set up on your non-production site, when the environment gets a request for a content file it doesn’t have, like an image, it will query the production site to create a local copy
- It also has a mode where those requests are served 301 redirects to their location on the production server, so no files are ever copied
- Once you have the module installed, you can set the origin website URL using the admin UI, using a drush variable-set command, or you can add a line to your settings.php file.
- Also, if you have simple HTTP authentication set up on the site you want to pull from (for example using the Shield module), you can add URL-encoded versions of the username and password to the origin URL, and the module will still be able to copy down the files.
- This module was previously covered in this podcast way back in episode #33, but I thought it was worth bringing back because it is so useful for working on sites locally or across non-production environments
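For the HTTP authentication case mentioned above, the username and password have to be percent-encoded before they go into the origin URL. A minimal sketch in Python using only the standard library; the function name, the example host, and the credentials are hypothetical, and where the resulting value is stored (admin UI, drush, or settings.php) is up to the module's own configuration:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def origin_with_credentials(origin, username, password):
    """Return the origin URL with percent-encoded credentials embedded in it."""
    parts = urlsplit(origin)
    # safe="" makes quote() also encode characters such as "/", "@" and ":".
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Hypothetical credentials containing characters that must be encoded.
print(origin_with_credentials("https://www.example.com", "stage user", "p@ss:word"))
# https://stage%20user:p%40ss%3Aword@www.example.com
```

Encoding matters mostly for characters such as "@" and ":", which would otherwise be read as URL delimiters.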
ImageX: Gutenberg Editor: an Alternative Approach to Creating Drupal Content Pages
Authored by Nadiia Nykolaichuk.
It’s great to have a choice of different options when it comes to creating content pages. In addition to Drupal core’s Layout Builder and CKEditor, you are always free to consider installing alternative contributed tools if that’s what resonates with your team’s preferences. One of the prominent examples is Drupal Gutenberg.
Members Newsletter – August 2024
The lively conversation about the role of data in building and modifying AI systems will continue as the OSI travels to China this month for AI_dev (August 21-23 in Hong Kong) and Open Source Congress (August 25-27 in Beijing). The OSI has been able to chime in on news stories on the topic, several of which are linked here in the newsletter.
Last month the OSI was at the United Nations in New York City for OSPOs for Good, an event that covered key areas of open source policy, as well as emerging examples of ‘Open Source for good’ from across the globe. I participated in a panel on Open Source AI.
Creating an Open Source AI Definition has been an arduous task over the past couple of years, but we know the importance of creating this standard so the freedoms to use, study, share and modify AI systems can be guaranteed. Those are the core tenets of Open Source, and it warrants the dedicated work it has required. Please read about the people who have played key roles in bringing the Definition to life in our Voices of Open Source AI Definition on the blog.
Stefano Maffulli
Executive Director, OSI
I hold weekly office hours on Fridays with OSI members: book time if you want to chat about OSI’s activities, if you want to volunteer or have suggestions.
News from the OSI
OSI at the United Nations OSPOs for Good
From the Research and Advocacy program
Earlier this month the Open Source Initiative participated in the “OSPOs for Good” event promoted by the United Nations in NYC. Read more.
The Open Source Initiative joins CMU in launching Open Forum for AI: A human-centered approach to AI development
From the Research and Advocacy program
The Open Source Initiative (OSI) is pleased to share that we are joining the founding team of Open Forum for AI (OFAI), an initiative designed by Carnegie Mellon University (CMU). Read more.
GUAC adopts license metadata from ClearlyDefined
From the License and Legal program
The software supply chain just gained some transparency thanks to an integration of the Open Source Initiative (OSI) project, ClearlyDefined, into GUAC (Graph for Understanding Artifact Composition), an OpenSSF project from the Linux Foundation. Read more.
Better identifying conda packages with ClearlyDefined
From the License and Legal program
ClearlyDefined now provides a new harvester implementation for conda, a popular package manager with a large collection of pre-built packages for various domains, including data science, machine learning, scientific computing and more. Read more.
OSI in the news
Can AI even be open source? It’s complicated
OSI at ZDNet
AI can’t exist without open source, but the top AI vendors are unwilling to commit to open-sourcing their programs and data sets. To complicate matters further, defining open-source AI is a messy issue that has yet to be settled. Read more.
Open Source AI: What About Data Transparency?
OSI at The New Stack
AI uses both code and data, and this combination continues to be a challenge for open source, said experts at the United Nations OSPOs for Good Conference. Read more.
A new White House report embraces open-source AI
OSI at ZDNet
The National Telecommunications and Information Administration (NTIA) issued a report supporting open-source and open models to promote innovation in AI, while emphasizing the need for vigilant risk monitoring. Read more.
With Open Source Artificial Intelligence, Don’t Forget the Lessons of Open Source Software
OSI at CISA
While there is not yet a consensus on the definition of what constitutes “open source AI”, the Open Source Initiative, which maintains the “Open Source Definition” and a list of approved OSS licenses, has been “driving a multi-stakeholder process to define an ‘Open Source AI’”. Read more.
Meta inches toward open source AI with new LLaMA 3.1
OSI at ZDNet
Is Meta’s 405 billion parameter model really open source? Depends on who you ask. Here’s how to try out the new engine for yourself. Read more.
Other news
News from OSI affiliates
- Mozilla Foundation: Mozilla’s Policy Vision for the new EU Mandate: Advancing Openness, Privacy, Fair Competition, and Choice for all
- OASIS Open: The biggest names in AI have teamed up to promote AI security
- Apache Software Foundation, Eclipse Foundation, Linux Foundation: How open source attracts some of the world’s top innovators
- Eclipse Foundation: The Eclipse Foundation Announces Agenda and Keynote Speakers for Open Community Experience (OCX 2024), Europe’s Premier Event for Open Source Innovation
- Open Source takes center stage at United Nations
- Open Source in Europe: Facing the regulatory challenge
- Open Source projects vs products: A strategic approach
The Open Source Initiative (OSI) is running a series of stories about a few of the people involved in the Open Source AI Definition (OSAID) co-design process.
7th annual OSPO and Open Source Management Survey
The TODO Group and Linux Foundation Research, in partnership with Cisco, NGINX, Open Source Initiative, InnerSource Commons, and CHAOSS, are excited to be launching the 7th annual OSPO and Open Source Management survey! Take survey here.
2024 Open Source Software Funding Survey
This survey tries to better understand how organizations fund, contribute to, and support open source software projects. This survey is a collaboration between GitHub, Inc., the Linux Foundation, and researchers from Harvard University. Take survey here.
Events
Upcoming events
- AI_dev China (August 21-23, 2024 – Hong Kong)
- Open Source Congress (August 25-27, 2024 – Beijing)
- Open Source Summit Europe (September 16-18, 2024 – Vienna)
- Nerdearla Argentina (September 24-28, 2024 – Buenos Aires)
- SOSS Fusion (October 22-23, 2024 – Atlanta)
- Open Community Experience (October 22-24, 2024 – Mainz)
- All Things Open (October 27-29, 2024 – Raleigh)
- OpenForum Academy Symposium (November 13-14, 2024 – Boston)
- Cisco
- Microsoft
- Bloomberg
- SAS
- Intel
Interested in sponsoring, or partnering with, the OSI? Please see our Sponsorship Prospectus and our Annual Report. We also have a dedicated prospectus for the Deep Dive: Defining Open Source AI. Please contact the OSI to find out more about how your company can promote open source development, communities and software.
Support OSI by becoming a member!
Let’s build a world where knowledge is freely shared, ideas are nurtured, and innovation knows no bounds!
The Drop Times: 'Drupal at Your Fingertips' Is Designed as a Quick Reference for Experienced Developers: Selwyn Polit
Real Python: How to Install Python on Your System: A Guide
Installing the latest version of Python on your computer could be a common requirement for you as a Python programmer. Fortunately, you’ll have a multitude of installation options. For example, you can download the official Python installer from Python.org, use your operating system’s package manager or app store, and more.
In this tutorial, you’ll focus on official CPython distributions, which are generally the best option for learning to program with the language. However, you’ll also learn about a few other distributions, like the one available on Homebrew for macOS users.
In this tutorial, you’ll learn how to:
- Check whether a version of Python is installed on your system
- Install or update to the latest Python on Windows, macOS, and Linux
- Install Python on mobile devices like phones or tablets
- Use Python on your browser with online interpreters
This tutorial covers installing the latest Python on the most important platforms or operating systems, such as Windows, macOS, Linux, iOS, and Android. However, it doesn’t cover all the existing Linux distributions, which would be a huge task. Anyway, you’ll find instructions for the most popular distros nowadays.
To get the most out of this tutorial, you should be comfortable using your operating system’s terminal or command line.
Take the Quiz: Test your knowledge with our interactive “Python Installation and Setup” quiz. You’ll receive a score upon completion to help you track your learning progress:
Interactive Quiz
Python Installation and Setup
In this quiz, you'll test your understanding of how to install or update Python on your computer. With this knowledge, you'll be able to set up Python on various operating systems, including Windows, macOS, and Linux.
Windows: How to Check or Get Python
In this section, you’ll learn to check whether Python is installed on your Windows operating system (OS) and which version you have. You’ll also explore three installation options that you can use on Windows.
Note: In this tutorial, you’ll focus on installing the latest version of Python in your current operating system (OS) rather than on installing multiple versions of Python. If you want to install several versions of Python in your OS, then check out the Managing Multiple Python Versions With pyenv tutorial. Note that on Windows machines, you’d have to use pyenv-win instead of pyenv.
For a more comprehensive guide on setting up a Windows machine for Python programming, check out Your Python Coding Environment on Windows: Setup Guide.
Checking the Python Version on Windows
To check whether you already have Python on your Windows machine, open a command-line application like PowerShell or the Windows Terminal.
Follow the steps below to open PowerShell on Windows:
- Press the Win key.
- Type PowerShell.
- Press Enter.
Alternatively, you can right-click the Start button and select Windows PowerShell or Windows PowerShell (Admin). In some versions of Windows, you’ll find Terminal or Terminal (admin).
Note: To learn more about your options for the Windows terminal, check out Using the Terminal on Windows.
With the command line open, type in the following command and press the Enter key:
Windows PowerShell
    PS> python --version
    Python 3.x.z
Using the --version switch will show you the installed version. Note that the 3.x.z part is a placeholder here. On your machine, x and z will be numbers corresponding to the specific version you have installed.
Alternatively, you can use the -V switch:
Windows PowerShell
    PS> python -V
    Python 3.x.z
Using the python -V or python --version command, you can check whether Python is installed on your system and learn what version you have. If Python isn’t installed on your OS, you’ll get an error message.
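If you already have one Python interpreter running and want to perform the same check from a script, you can call the command through the standard library. The following is a minimal sketch rather than part of the official instructions; the helper name python_version_of is hypothetical:

```python
import subprocess
import sys


def python_version_of(executable):
    """Return the text printed by `<executable> --version`, or None if it can't be run."""
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # The executable is missing, not runnable, or exited with an error.
        return None
    # Very old interpreters wrote the version to stderr instead of stdout.
    return (result.stdout or result.stderr).strip()


# sys.executable is the interpreter running this script, so this call always works.
print(python_version_of(sys.executable))
# A name that isn't on your PATH yields None instead of an error message.
print(python_version_of("definitely-not-a-real-python-binary"))
```

Passing "python" or "python3" instead of sys.executable checks whatever your PATH resolves those names to.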
Knowing the Python Installation Options on Windows
Read the full article at https://realpython.com/installing-python/ »
Ned Batchelder: Coverage branches instead of arcs
As I mentioned in a few recent posts, I’ve been doing some significant work in coverage.py to take advantage of new capabilities in Python.
Mark Shannon has been improving the sys.monitoring API so that branch coverage can be done with low overhead. I want to take advantage of that in coverage.py, but I needed to do some refactoring work first. The tests were focused on mapping the complete set of code pathways (which I called arcs), but using low-overhead branch monitoring won’t provide those complete pathways. If the tests continued to focus on them, they would fail with sys.monitoring.
But the complete pathways aren’t actually needed. The useful information is where the branches are, and which branches were taken. That can be measured with sys.monitoring. So a first step was to refactor the tests to focus on branches instead of arcs. That took a while, but is now done.
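To make the distinction concrete, here is a toy sketch. It is not coverage.py's actual code: the function name, the sample data, and the use of negative numbers for entering and leaving a function are all assumptions made for illustration. Arcs record every line-to-line step, while branch reporting only needs the lines that can go more than one way and which destinations were reached from them:

```python
def missing_branches(possible, executed_arcs):
    """Return, per branching line, the destinations that were never reached."""
    missing = {}
    for line, destinations in possible.items():
        not_taken = sorted(d for d in destinations if (line, d) not in executed_arcs)
        if not_taken:
            missing[line] = not_taken
    return missing


# Hypothetical function under test:
#   1: def sign(x):
#   2:     if x < 0:
#   3:         return "negative"
#   4:     return "non-negative"

# Every line-to-line step taken by one call, sign(5).
# Negative numbers stand for entering and leaving the function (an assumption of this sketch).
executed_arcs = {(-1, 2), (2, 4), (4, -1)}

# Branch data: only line 2 can go two ways, to line 3 or to line 4.
possible = {2: {3, 4}}

print(missing_branches(possible, executed_arcs))  # {2: [3]}
```

Only the pair (2, 4) matters for the report; the entry and exit arcs carry no branch information, which is why losing them is not a problem.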
Not needing all those arcs also meant I could simplify the AST-based parser that found the arcs, removing about 150 lines. I suspect there’s more that could be removed. Maybe it will happen over time. Also, the new code.co_branches() method might make it all obsolete over time.
If you read Coverage at a crossroads on this blog, I talked about using ideas from SlipCover like inserting fake lines with an import hook. Those exotic ideas were appealing in their way, but are no longer needed, and they would have brought a bunch of complexity. With the two new sys.monitoring events, we can get the branch information directly without advanced shenanigans.
There’s more work to do, including attending to incoming bug reports. If you’d like to help, or learn more about any of this, we have a #coverage-py channel in the Python Discord.
The Drop Times: Out-of-the-Box Functionality Survey Reveals the Community's Enthusiasm for Starshot
The Drupal community has taken another step forward under the Starshot Initiative. Recently, the team concluded a survey aimed at pinpointing the most desired out-of-the-box features and contributed modules for the upcoming ‘Drupal CMS’. This survey targeted ambitious marketers as part of the broader Drupal Starshot strategy, resulting in 60 detailed submissions and over 100 feature suggestions. These insights, now available on Drupal.org thanks to Pamela Barone's announcement, will play a crucial role in shaping the platform’s future.
The feedback received from the survey highlights a strong community interest in several key areas. Among the most frequently mentioned were enhancements to page-building tools, SEO capabilities, improved form builders, and content management functionalities. The desire for better security, media management, and multilingual support also stood out as significant themes. Interestingly, while many of these suggestions align with existing development initiatives, the survey also introduced several fresh ideas that are now under consideration by the Drupal leadership team.
Particularly noteworthy are the suggestions for modules that could elevate Drupal’s out-of-the-box experience. Modules like Metatag, Webform, and Admin Toolbar were repeatedly mentioned and are now being evaluated for possible inclusion in future releases. These modules, known for their functionality and ease of use, could significantly enhance the user experience if integrated into the out-of-the-box Drupal CMS offering.
While the survey is not being treated as a direct vote, it serves as a powerful validation tool. The results ensure that the Drupal development tracks are closely aligned with the needs and expectations of its community. As the leadership team assesses these suggestions, they are keenly aware of the balance between innovation and the consistency of user experience that Drupal is known for.
Curious about the detailed findings and how they might shape the next generation of Drupal? You can dive deeper into the survey results here: Community Demands Enhanced Out-of-the-Box Features in DrupalCMS. As the Starshot Initiative continues to gather momentum, the community eagerly awaits the next steps in this exciting journey.
As we turn our attention to the latest from The Drop Times, the focus has been on the ongoing Drupal Association Board Elections. As part of their "Meet the Candidate" campaign, several candidates have shared their visions and plans for Drupal's future.
Matthew Saunders discusses his candidacy in an interview with Alka Elizabeth, a sub-editor at The Drop Times. Focusing on improving governance, fostering inclusivity, and supporting neurodiverse individuals, Matthew outlines his motivations for running for the Drupal Association Board. His ideas provide valuable insights for voters as the election progresses.
Kevin Quillen, Practice Lead at Velir, brings over 16 years of experience to his candidacy. In his interview with Alka Elizabeth, Kevin emphasizes the importance of modernizing Drupal.org, attracting new developers, and enhancing Drupal's global appeal. His vision for the future could significantly impact the platform’s evolution.
Albert Hughes, Product Owner at Stanford University, offers a unique perspective on expanding Drupal’s reach. His candidacy is grounded in his diverse experiences and a strong commitment to innovation. As the election continues, Albert’s ideas for growth and development resonate with many in the community.
In the final installment of The Drop Times' campaign, Alejandro Moreno Lopez, Partner Manager and Developer Relations at Pantheon, shares his journey within the Drupal community. Alejandro is passionate about reducing the Association's dependency on DrupalCon and fostering collaboration and innovation. His interview provides a compelling case for his candidacy as voting continues until September 5th.
Discover why Drupal's latest product will be called 'Drupal CMS' and not just 'Drupal.' An insightful article authored by Sebin A Jacob, Editor-in-Chief of The Drop Times, explores the strategic decision-making, community feedback, and future implications behind this significant naming shift that redefines the way we think about Drupal's evolution.
The Drupal Decoupled project, also known as headless Drupal, has introduced a new feature to simplify the adoption and implementation of decoupled architecture. This project, which separates the back-end content management from the front-end display, now leverages "Recipes" and can be easily adopted as a Composer Project Template. Jesús Manuel Olivas, Co-Founder and CEO of Octahedroid and Composabase, recently announced this update.
Morpht has launched its "Content Recommendation Playbook," showcasing how personalized content recommendations using Recombee's service can enhance user experiences. The playbook explains how to integrate these systems into Drupal and GovCMS to deliver tailored content based on user behavior, boosting engagement.
During DrupalCon Portland 2022, concerns over the sustainability of free software led to the conception of Drupal Forge, a platform aimed at financially supporting project maintainers. The idea, sparked by Webform developer Jacob Rockowitz, was further developed by Darren Oh, who proposed adding a launch button for trial sites on project pages to generate recurring revenue. While the initiative has garnered interest, challenges remain in implementing and scaling this solution.
Sponsorship opportunities for BADCamp 2024, set for October 24-25 in Oakland, California, are now open, offering extensive visibility to organizations within the Drupal community. With packages ranging from $1,000 to $2,000, sponsors can gain exposure through speaking engagements, branding at summits, and hosting social events.
Chattanooga Open Source Camp, featuring DrupalCamp Chattanooga 2024, seeks sponsors for its November 2nd event at Chattanooga State Community College. Sponsorships range from $20 to $2,000, offering opportunities for businesses to gain visibility within the tech community. In-kind sponsorships are also welcomed, with a total event budget of $6,500.
The Drop Times has been named the official Media Partner for DrupalCamp Pune 2024, set for October 19-20 at Yashada, Pune. This partnership will ensure comprehensive coverage of the event, featuring sessions, workshops, and keynotes from industry leaders. Organized by the Drupalers Association Pune, the camp aims to foster innovation, learning, and networking within the Drupal community.
The Splash Awards will debut in Asia during DrupalCon Singapore 2024, with submissions open until September 27. The prestigious event, recognizing excellence in Drupal web development, will culminate in a ceremony on December 9 at the Garden Ballroom, PARKROYAL Collection Marina Bay.
The Drupal CEO Network and the Drupal Association have extended the deadline for the 2024 Drupal Business Survey to September 4th. This annual survey gathers crucial insights from Drupal business leaders, shaping an anonymized industry report to guide strategic decisions. The results will be unveiled at DrupalCon Barcelona 2024, with discussions set for September 25 and 26.
The Aten Design Group will host an online session on August 28, 2024, at 2:00 PM EDT to discuss the recent release of Drupal 11. Seth Hill, Senior Developer at Aten, will lead the session designed for Drupal site owners, content administrators, and developers who want to learn more about the new version and its potential benefits.
We acknowledge that there are more stories to share. However, due to selection constraints, we must pause further exploration for now.
To get timely updates, follow us on LinkedIn, Twitter and Facebook. You can also join us on Drupal Slack at #thedroptimes.
Thank you,
Sincerely
Kazima Abbas
Sub-editor, The DropTimes.
FrOScon 2024
This year, I attended FrOScon for the first time. FrOScon is the biggest conference about free and open-source software in Germany. It takes place every year in Bonn/Siegburg (Germany) at the weekend and is free to attend.
For the first time, I was not at a conference to staff a KDE stand. My employer had a stand there, and it was a great occasion for me to meet some colleagues and fellow KDE and Matrix contributors.
So I spent the majority of my time at the GnuPG stand and discussing many things with Volker, including KDE PIM and the future of KWallet.
I also met many Matrix community members and am excited to attend the Matrix Conference next month.
All in all, it was a great conference, and I hope to see more KDE people there next year, maybe even with our own KDE stand.
Python Bytes: #398 Open source makes you rich? (and other myths)
Zato Blog: Integrating with Jira APIs
Continuing in the series of articles about the newest cloud connections in Zato 3.2, this episode covers Atlassian Jira from the perspective of invoking its APIs to build integrations between Jira and other systems.
There are essentially two modes of integration with Jira:
- Jira reacts to events taking place in your projects and invokes your endpoints accordingly via WebHooks. In this case, it is Jira that explicitly establishes connections with and sends requests to your APIs.
- Jira projects are queried periodically or as a consequence of events triggered by Jira using means other than WebHooks.
The first case is usually more straightforward to conceptualize - you create a WebHook in Jira, point it to your endpoint and Jira invokes it when a situation of interest arises, e.g. a new ticket is opened or updated. I will talk about this variant of integrations with Jira in a future instalment as the current one is about the other situation, when it is your systems that establish connections with Jira.
The reason why it is more practical to first speak about the second form is that, even if WebHooks are somewhat easier to reason about, they do come with their own ramifications.
To start off, assuming that you use the cloud-based version of Jira (e.g. https://example.atlassian.net), you need to have a publicly available endpoint for Jira to invoke through WebHooks. Very often, this is undesirable because the systems that you need to integrate with may be internal ones, never meant to be exposed to public networks.
Secondly, your endpoints need to have a TLS certificate signed by a public Certificate Authority and they need to be accessible on port 443. Again, most enterprise systems will either not allow this at all or it may take months or years to process such a change internally across the various corporate departments involved.
Lastly, even if a WebHook can be used, it is not always a given that the initial information that you receive in the request from a WebHook will already contain everything that you need in your particular integration service. Thus, you will still need a way to issue requests to Jira to look up details of a particular object, such as tickets, in this way reducing WebHooks to the role of initial triggers of an interaction with Jira, e.g. a WebHook invokes your endpoint, you have a ticket ID on input and then you invoke Jira back anyway to obtain all the details that you actually need in your business integration.
The end situation is that, although WebHooks are a useful concept that I will write about in a future article, they may very well not be sufficient for many integration use cases. That is why I start with integration methods that are alternative to WebHooks.
Alternatives to WebHooks
If, in our case, we cannot use WebHooks then what next? Two good approaches are:
- Scheduled jobs
- Reacting to emails (via IMAP)
Scheduled jobs will let you periodically inquire with Jira about the changes that you have not processed yet. For instance, with a job definition as below:
Now, the service configured for this job will be invoked once per minute to carry out any integration works required. For instance, it can get a list of tickets since the last time it ran, process each of them as required in your business context and update a database with information about what has been just done - the database can be based on Redis, MongoDB, SQL or anything else.
Integrations built around scheduled jobs make the most sense when you need to make periodic sweeps across large swaths of business data. These are the "Give me everything that changed in the last period" kind of interactions, when you do not know precisely how much data you are going to receive.
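As a rough illustration of such a sweep, the sketch below builds a JQL filter from the time of the previous run. The function name, the project key and the example timestamp are assumptions made for this illustration only; how the time of the last run is stored is left out. Note that Jira interprets JQL dates in the time zone of the user running the query, so a real service would need to account for that.

```python
from datetime import datetime

def build_updated_since_jql(project: str, last_run: datetime) -> str:
    """Return a JQL query matching tickets updated since the last run.

    Hypothetical helper, not part of Zato or Jira.
    """
    # JQL accepts dates in the "yyyy-MM-dd HH:mm" format
    since = last_run.strftime('%Y-%m-%d %H:%M')
    return f'project = {project} AND updated >= "{since}" ORDER BY updated ASC'

# Example: the previous run took place on 1 September 2024 at 12:30
jql = build_updated_since_jql('ABC', datetime(2024, 9, 1, 12, 30))
print(jql)
```

The resulting string could then be given to whatever search call the Jira client in use provides.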
In the specific case of Jira tickets, though, an interesting alternative may be to combine scheduled jobs with IMAP connections:
The idea here is that when new tickets are opened, or when updates are made to existing ones, Jira will send out notifications to specific email addresses and we can take advantage of it.
For instance, you can tell Jira to CC or BCC an address such as zato@example.com. Now, Zato will still run a scheduled job but instead of connecting with Jira directly, that job will look up unread emails in its inbox ("UNSEEN" per the relevant RFC).
Anything that is unread must be new since the last iteration which means that we can process each such email from the inbox, in this way guaranteeing that we process only the latest updates, dispensing with the need for our own database of tickets already processed. We can extract the ticket ID or other details from the email, look up its details in Jira and then continue as needed.
All the details of how to work with IMAP emails are provided in the documentation but it would boil down to this:
    # -*- coding: utf-8 -*-

    # Zato
    from zato.server.service import Service

    class MyService(Service):

        def handle(self):
            conn = self.email.imap.get('My Jira Inbox').conn

            for msg_id, msg in conn.get():

                # Process the message here ..
                process_message(msg.data)

                # .. and mark it as seen in IMAP.
                msg.mark_seen()

The natural question is - how would the "process_message" function extract details of a ticket from an email?
There are several ways:
- Each email has a subject of a fixed form - "[JIRA] (ABC-123) Here goes description". In this case, ABC-123 is the ticket ID.
- Each email will contain a summary, such as the one below, which can also be parsed:
- Finally, each email will have an "X-Atl-Mail-Meta" header with interesting metadata that can also be parsed and extracted:
The first option is the most straightforward and likely the most convenient one - simply parse out the ticket ID and call Jira with that ID on input for all the other information about the ticket. How to do it exactly is presented in the next chapter.
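A minimal sketch of the first option, assuming that subjects follow the fixed form shown above. The function name and the regular expression are illustrative and not part of Zato or Jira; subjects that carry prefixes such as "Re:" would need extra handling.

```python
import re

# Hypothetical pattern for subjects such as "[JIRA] (ABC-123) Here goes description"
SUBJECT_PATTERN = re.compile(r'^\[JIRA\]\s+\((?P<key>[A-Z][A-Z0-9_]*-\d+)\)')

def extract_ticket_key(subject: str):
    """Return the ticket ID found in the subject, or None if there is none."""
    match = SUBJECT_PATTERN.match(subject.strip())
    return match.group('key') if match else None

key = extract_ticket_key('[JIRA] (ABC-123) Here goes description')
```

Returning None for subjects that do not match lets the calling code skip emails that are not Jira notifications instead of failing on them.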
Regardless of how we parse the emails, the important part is that we know that we invoke Jira only when there are new or updated tickets - otherwise there would not have been any new emails to process. Moreover, because it is our side that invokes Jira, we do not expose our internal system to the public network directly.
However, from the perspective of the overall security architecture, email is still part of the attack surface so we need to make sure that we read and parse emails with that in view. In other words, regardless of whether it is Jira invoking us or us reading emails from Jira, all the usual security precautions regarding API integrations and accepting input from external resources still hold and need to be part of the design of the integration workflow.
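One possible precaution, shown here only as a sketch, is to treat whatever was parsed out of an email as untrusted and to check it before it is used in a call to Jira. The allowlist of project keys, the length limit and the function name are all assumptions made for this example.

```python
# Hypothetical allowlist of Jira projects that this integration may touch
ALLOWED_PROJECTS = {'ABC', 'OPS'}

# Hypothetical upper bound on the length of a ticket ID
MAX_KEY_LENGTH = 32

def is_trusted_key(key) -> bool:
    """Accept only short, well-formed ticket IDs from known projects."""
    if not isinstance(key, str) or len(key) > MAX_KEY_LENGTH:
        return False

    # A ticket ID has the form PROJECT-NUMBER
    project, separator, number = key.partition('-')

    return (
        bool(separator)
        and project in ALLOWED_PROJECTS
        and number.isascii()
        and number.isdigit()
    )
```

A check like this does not replace other safeguards, such as restricting who can deliver email to the inbox, but it keeps unexpected input from reaching the Jira API.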
Creating Jira connections
The above presented the ways in which we can arrive at the step where we invoke Jira, and now we are ready to actually do it.
As with other types of connections, Jira connections are created in Zato Dashboard, as below. Note that you use the email address of a user on whose behalf you connect to Jira but the only other credential is that user's API token previously generated in Jira, not the user's password.
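For background, Jira Cloud's REST API accepts the email address and the API token as HTTP Basic credentials. The connection defined in Dashboard is expected to take care of authentication, so the sketch below is not something you need to write yourself. It only shows what the two values amount to in a request; the function name and the sample token are made up for this example.

```python
import base64

def build_basic_auth_header(email: str, api_token: str) -> dict:
    """Build an HTTP Basic Authorization header from an email and an API token."""
    # Basic credentials are "username:password", with the API token
    # used in place of the password
    credentials = f'{email}:{api_token}'.encode('utf-8')
    encoded = base64.b64encode(credentials).decode('ascii')
    return {'Authorization': f'Basic {encoded}'}

# Hypothetical credentials, for illustration only
headers = build_basic_auth_header('zato@example.com', 'my-api-token')
```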
Invoking Jira
With a Jira connection in place, we can now create a Python API service. In this case, we accept a ticket ID on input (called "a key" in Jira) and we return a few details about the ticket to our caller.
This is the kind of service that could be invoked from a service that is triggered by a scheduled job. That is, we would separate the tasks: one service would be responsible for opening IMAP inboxes and parsing emails and the one below would be responsible for communication with Jira.
Thanks to this loose coupling, we make everything much more reusable - that the services can be changed independently is but one part and the more important side is that, with such separation, both of them can be reused by future services as well, without tying them rigidly to this one integration alone.
    # -*- coding: utf-8 -*-

    # stdlib
    from dataclasses import dataclass

    # Zato
    from zato.common.typing_ import cast_, dictnone
    from zato.server.service import Model, Service

    # ###########################################################################

    if 0:
        from zato.server.connection.jira_ import JiraClient

    # ###########################################################################

    @dataclass(init=False)
    class GetTicketDetailsRequest(Model):
        key: str

    @dataclass(init=False)
    class GetTicketDetailsResponse(Model):
        assigned_to: str = ''
        progress_info: dictnone = None

    # ###########################################################################

    class GetTicketDetails(Service):

        class SimpleIO:
            input = GetTicketDetailsRequest
            output = GetTicketDetailsResponse

        def handle(self):

            # This is our input data
            input = self.request.input # type: GetTicketDetailsRequest

            # .. create a reference to our connection definition ..
            jira = self.cloud.jira['My Jira Connection']

            # .. obtain a client to Jira ..
            with jira.conn.client() as client:

                # Cast to enable code completion
                client = cast_('JiraClient', client)

                # Get details of a ticket (issue) from Jira
                ticket = client.get_issue(input.key)

            # Observe that ticket may be None (e.g. invalid key), hence this 'if' guard ..
            if ticket:

                # .. build a shortcut reference to all the fields in the ticket ..
                fields = ticket['fields']

                # .. build our response object ..
                response = GetTicketDetailsResponse()
                response.assigned_to = fields['assignee']['emailAddress']
                response.progress_info = fields['progress']

                # .. and return the response to our caller.
                self.response.payload = response

    # ###########################################################################

Creating a REST channel and testing it
The last remaining part is a REST channel to invoke our service through. We will provide the ticket ID (key) on input and the service will reply with what was found in Jira for that ticket.
We are now ready for the final step - we invoke the channel, which invokes the service which communicates with Jira, transforming the response from Jira to the output that we need:
    $ curl localhost:17010/jira1 -d '{"key":"ABC-123"}'
    {
        "assigned_to":"zato@example.com",
        "progress_info": {
          "progress": 10,
          "total": 30
        }
    }
    $

And this is everything for today - just remember that this is just one way of integrating with Jira. The other one, using WebHooks, is something that I will go into in one of the future articles.
More resources
➤ Python API integration tutorial
➤ What is an integration platform?
➤ Python Integration platform as a Service (iPaaS)
➤ What is an Enterprise Service Bus (ESB)? What is SOA?
Haruna 1.2.0
Haruna version 1.2.0 is out with a new footer style.
Availability of other package formats depends on your distro and the people who package Haruna.
Windows version:
- haruna-1.2.0-windows-gcc-x86_64.exe
- haruna-1.2.0-windows-gcc-x86_64.7z
- haruna-1.2.0-windows-gcc-x86_64-dbg.7z
If you like Haruna then support its development: GitHub Sponsors | Liberapay | PayPal
Feature requests and bugs should be posted on bugs.kde.org, but for bugs make sure to fill in the template and provide as much information as possible.
Changelog: 1.2.0
- Added floating footer/bottom toolbar style with 2 ways to trigger it:
- on every mouse movement of the video area
- only when the mouse is in the lower part of the video area
- Removed the docbook and moved its content to tooltips
- Middle clicking the playlist scrolls to the playing item
KDE Goals - Our Cumulative Culture
Every two years, the KDE community selects three goals that serve as focal points for the entire community's efforts in the coming years. This cyclical process of goal-setting and community-wide focus is a great example of KDE's Cumulative Culture in action.
This concept, typically observed in human societies, refers to the ability to build upon previous knowledge and innovations to create increasingly complex and effective solutions. In KDE's case, each cycle of goals represents a new layer of accumulated wisdom, i.e. new features and more stability.
The First Cycle (2018-2020)
The first cycle of goals laid the groundwork with its focus on community growth, privacy, and usability.
- Streamlined Onboarding: Focused on attracting and retaining new contributors by making the onboarding process smoother and more engaging.
- Privacy Software: Prioritized user privacy and security, ensuring KDE software respects user data and complies with security standards.
- Usability & Productivity: Aimed to enhance the usability and productivity of KDE software, making it powerful yet easy to use.
The second cycle tackled more complex challenges, with goals such as Wayland implementation improvements (which laid the foundation for the Plasma 6 release), improving the app ecosystem, and ensuring consistency in design and functionality.
- Wayland: This task aimed at stabilizing Wayland support across KDE apps.
- All About the Apps: Improved KDE's app infrastructure, enabling more efficient app delivery and better support services.
- Improve Consistency across the Board: Ensured uniformity in design and functionality across KDE software, improving usability and reducing redundancy.
The third cycle, which is currently coming to an end, was about progress and adaptation, with a focus on environmental responsibility, operational efficiency, and inclusive design.
- Sustainable Software: Focused on making KDE software more energy-efficient and environmentally friendly by implementing practices that reduce resource consumption and ensure long-term sustainability.
- Automate and Systematize Internal Processes: Aimed to streamline KDE’s internal workflows by automating repetitive tasks, adding code tests across projects and creating a Quality Assurance team to name a few.
- KDE For All: Sought to make KDE software accessible and inclusive for all users.
Now, as we enter the fourth cycle of the KDE Goals, we see the full power of this cumulative process. Each goal, whether fully achieved or not, contributes to the collective knowledge and capability of the KDE community. Ideas and partial solutions from past cycles become a solid foundation of knowledge and experience that support future efforts.
The community is currently voting on the following proposals for the next KDE Goals cycle that will guide our efforts and shape our focus for the coming years:
- Enhancing control and automation: integrate KDE Plasma (and apps) with Smart Home Ecosystems
- Freedom through Better Data and Workflow Organization and Management
- KDE Needs You! 🫵 - Formalise and boost KDE's processes for recruiting active contributors
- KDE-based Text Snippet Expansion
- Sandbox all the things!
- Plasma - A Beacon for Open Design
- Refining and Enriching KDE: Empowering Users with Convenient and Intuitive Features
- Streamlined Application Development Experience
- Unify the Plasma experience
- We care about your Input
The three most voted goals will be announced at Akademy, where there will also be a wrap-up talk about the achievements of the current goals. Also, there will be Birds-of-a-feather (BoF) sessions with the new goal champions.
Join the Matrix room and keep an eye on the website for the latest KDE Goals updates.
Matt Layman: Layman's Guide to Python Built-in Functions
What's New In The Revised Blue Angel Criteria
KDE's Okular was the first software to be awarded the Blue Angel label for resource and energy-efficient software products. The certification was based on the first version of the criteria for this product category, which were introduced in 2020. Now the criteria have been updated. What has changed and what does that mean for KDE?
The revised criteria are available as version 4 on the Blue Angel web site. Only the German version is currently available; the English version will follow shortly.
New software categories
The biggest change is the scope of the label. In the past it was limited to desktop software. With the updated version, the criteria also include software on mobile devices and server software or a combination of these categories, such as a web service with mobile and desktop clients.
The biggest challenge is the measurement of the energy and resource efficiency for these new categories, which requires a more flexible approach and must accommodate scenarios where the measurement cannot be done by inserting a meter in front of the power supply of a single device. The new criteria address this by defining applicable methods for the measurement of mobile and server applications.
The extended scope covers a much broader range of software. For KDE the desktop category is most relevant, but of course a lot of software also interacts with a server component, for example an email client like KMail, which could now be treated and assessed as a combined client-server system to give more realistic and relevant results.
More flexible measurement procedure
The expansion in scope requires an expanded view on the measurement of energy and resource efficiency as well. The first version of the criteria was quite strict and prescribed a very specific measurement procedure on specified reference systems. It was based on a comparison of measurements in a representative usage scenario and in idle mode. This gave a realistic impression of what the usage of a computer program meant in terms of energy consumption.
The new criteria allow for more variation in how the measurements are carried out. The original method is still there, but variations which lead to comparable results are possible as well. This change means that a new criterion was introduced to document the way measurements are done.
In addition to the measurement of the usage scenario, a new type of measurement was introduced. This measures total energy consumption of a production system over a longer period of time. This is particularly useful for server applications, where this method can lead to more realistic numbers by averaging resource consumption over real-world usage of multiple users.
For mobile applications, the measurement also has to include the data volume transmitted during a standard usage scenario and the list of URLs it has accessed. This is based on the assumption that large volumes of data transfer imply a higher energy usage. It can also be used to assess if the application is using advertisements or is collecting tracking information. Both are forbidden under the revised Blue Angel criteria.
Ongoing assessment of energy and resource efficiency
The original criteria demanded that updates of the software still run on old reference systems and that the energy consumption does not increase by more than 10%. They were not very clear on how exactly this should be proven and documented. Especially for software which is released very often, testing every individual update is impractical. For mobile and even more for server software, update cycles can be very short, up to multiple updates a day.
In the updated criteria there is a more precise way of handling updates. The general idea remains that updated software should still run on old hardware and that energy consumption should not increase too much. But it's not tied to individual updates anymore. The required procedure is to do a measurement at least once a year and publish the results as part of the documentation of the software product. This includes documentation of the measurement setups and any changes to them as well as preserving the history of measurements, so that users can judge for themselves how much energy and resource usage is increasing over time.
This procedure clarifies the requirements and opens a pragmatic way of measuring updates. It implies a certain burden on updating documentation.
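To illustrate how a published history of yearly measurements could be read, the following sketch computes the increase of each measurement relative to the first one. All numbers, the unit and the function name are invented for this example and are not taken from the criteria or from any KDE measurement.

```python
def relative_increase(baseline: float, current: float) -> float:
    """Return the change from baseline to current in percent."""
    return (current - baseline) / baseline * 100.0

# Hypothetical energy use per run of the standard usage scenario, in Wh
history = {2022: 1.20, 2023: 1.26, 2024: 1.29}

baseline = history[2022]
changes = {
    year: round(relative_increase(baseline, value), 1)
    for year, value in history.items()
}
```

With these made-up values the software would use 5% more energy in 2023 and 7.5% more in 2024 than in the baseline year.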
Consequences for KDE and Okular
KDE holds the Blue Angel label for its PDF viewer Okular. This is desktop software and the standard usage scenario doesn't include any network access. That means that the expanded scope does not change anything for the existing certification. The revised criteria open up the opportunity to apply for the Blue Angel label for mobile software, such as KDE Connect, and mixed scenarios which also include server components, but the eco-certification for Okular is covered as it was before.
The more flexible measurement criteria give us more leeway in how we are doing the measurements. We have set up KEcoLab for being able to regularly do measurements. This setup follows the procedure prescribed in the original criteria. As this is still valid, it also means no change for us, and our measurements still fulfill the criteria. However, it gives us more opportunities to improve the lab and doesn't strictly tie us to the original list of reference systems anymore. We might want to take advantage of that.
The documentation of the measurement system is something we have always done in a transparent way, so this also doesn't require any big changes on our side. We have to consider how to best convey this in the documentation of Okular, but this is mostly a question on how we communicate the existing content.
The ongoing assessment of energy and resource efficiency ties very well into how we handle software updates. We have a continuous release stream with frequent updates and incremental changes. This fits the model of the new criteria. We have to review how we include regular updates of the documentation and measurement data in releases, but this again is mostly a question of how we communicate the existing content.
Conclusion
The revised criteria provide a welcome expansion of the Blue Angel to more categories of software and a more flexible way to do energy and resource efficiency measurements. They continue to align well with how KDE develops software in general and Okular in particular, so we do not see any issues with continuing the Blue Angel certification for Okular.
We would be happy if the new version of the criteria would increase adoption of the Blue Angel ecolabel for resource and energy efficient software. Sustainable software is an important topic and the Blue Angel can be one way of making progress in this area more visible to a broad audience.
NextCloudPi on Raspberry Pi 5
I finally took an evening to get NextCloudPi installed on a Raspberry Pi 5 with a large-ish NVMe drive. This was not a smooth ride. For your pleasure, this is how I got it working.
First, use Jeff Geerling’s guide to get the Pi booting from the NVMe drive.
Second, use this guide to move from Debian networking to systemd-networkd, but do not hold the avahi-daemon package.
Third, run the NextCloudPi curl install script.
Next up – the migration from my old instance. I have 1.5TB of files on a spin disk connected via USB that I need to move to the new NVMe storage – but that is for another night.
For the record – I do love NextCloud and NextCloudPi, so no finger pointing here, just sharing some frustration and how I got around the issue.