FLOSS Project Planets
Performance of data import in LabPlot
In many cases, importing data into LabPlot for further analysis and visualization is the first step in the application:
LabPlot supports many different formats (CSV, Origin, SAS, Stata, SPSS, MATLAB, SQL, JSON, binary, OpenDocument Spreadsheets (ods), Excel (xlsx), HDF5, MQTT, Binary Logging Format (BLF), FITS, NetCDF, ROOT (CERN), LTspice, Ngspice) and we plan to add support for even more formats in the future. All of these formats have their reasons for existence as well as advantages and disadvantages. However, the performance of reading the data varies greatly between the different formats and also between the different CPU generations. In this post, we’ll show how long it takes to import a given amount of data in four different formats: ASCII/CSV, Binary, HDF5, and NetCDF.
This post is not about promoting any of the formats, nor is it about doing very sophisticated measurements with different amounts and types of data and extensive CPU benchmarking. Rather, it’s about what you can (roughly) expect in terms of performance on the new and not so new hardware with the current implementation in LabPlot.
For this exercise, we import a data set with 1 integer column and 5 columns of float values (Brownian motion for 5 “particles”, one integer column for the index) with 50 million rows, which results in 300 million numerical values.
We take 6 measurements for each format and discard the first one, which is almost always an outlier: once the file has been read, the kernel's disk cache makes subsequent reads faster. We then calculate the average of the remaining measurements.
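The averaging step described above can be sketched in a few lines of Python. This is only an illustration: the function name `average_without_warmup` and the timings are hypothetical, not measurements from this post.

```python
from statistics import mean


def average_without_warmup(timings):
    """Average a series of timings, discarding the first (cold-cache) run."""
    if len(timings) < 2:
        raise ValueError("need at least two measurements")
    return mean(timings[1:])


# Hypothetical timings in seconds; the first run is the cold-cache outlier.
runs = [10.0, 2.0, 4.0, 3.0, 3.0, 3.0]
print(average_without_warmup(runs))
```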
As expected, the file formats that deal with binary representation internally (Binary, HDF5, NetCDF) provide significantly better performance compared to ASCII, and this difference becomes larger the slower the CPU is. The performance of HDF5 and NetCDF is almost the same because the newer version of NetCDF is based on HDF5.
The implementation in the data import code is straightforward. Ignoring for a moment the complexity with the different options affecting the behavior of the parser, different data types and other subtleties, once everything is set up it’s just a matter of iterating over the data, parsing it and converting it into the internal structures. The logic inside the loop is fixed, and a linear behavior with respect to the total number of values to read is expected. This expectation is confirmed using the same CPU (we took the fastest CPU from the table above) and varying the total number of rows with the fixed number of columns:
The performance of the import is even more critical when dealing with external data that is frequently modified. In order to provide a smooth visualization in LabPlot for such “live data”, it’s important to optimize all steps involved here, like the import of the new data itself, as well as the recalculation in the algorithms (smoothing, etc.) and in the visualization part. For the next release(s), we’re now working to further optimize the implementation to handle more performance-critical scenarios in the near future. The results of these activities, funded by the NLnet grant, will be the subject of a dedicated post soon.
Talk Python to Me: #484: From React to a Django+HTMX based stack
Tryton News: Tryton Release 7.4
We are proud to announce the 7.4 release of Tryton.
This release provides many bug fixes, performance improvements and some fine tuning.
You can give it a try on the demo server, use the docker image or download it here.
As usual upgrading from previous series is fully supported.
Here is a list of the most noticeable changes:
Changes for the User Clients
The Many2Many widget now has a restore button to revert the removal of records before saving.
The CSV export window stays open after the export is done so you can refine your export without having to redo all of the configuration.
It also supports exporting and importing translatable fields with a language per column.
The error messages displayed when there is a problem with the CSV import have been improved to include the row and column number of the value that caused the error.
The management window for the favourites has been removed and replaced by a simple “last favorite first” order.
The focus goes back to the search entry after performing a search/refresh.
You can now close a tab by middle clicking on it (as is common in other software).
Web Client
The left menu and the attachment preview can now be resized so the user can make them the optimal size for their screen.
Accounting
The minimal chart of accounts has been replaced by a universal chart of accounts which is a good base for IFRS and US GAAP.
It is now possible to copy an accounting move from a closed period. The closed period will be replaced by the current period after accepting the warning.
The payments are now numbered to make it easier to identify them inside the application.
An option has been added to the parties to allow direct debits to be created based on the balance instead of the accounting lines.
We’ve added a button on the Stripe payments and Stripe and Braintree customers to allow an update to be forced. This helps when fixing missed webhooks.
When a stock move is cancelled, the corresponding stock account move is now cancelled automatically.
However, it is no longer possible to cancel a done stock move which has been included in a calculation used for anglo-saxon accounting.
It is now possible to deactivate an agent so that they are no longer used for future orders.
Company
It is now possible to add a company logo. This is then displayed in the header of generated documents.
Incoterm
A warning is now raised when the incoterm of a shipment is different from the original document (such as the sale or purchase).
Party
We’ve added more identifiers for parties like the United Kingdom Unique Taxpayer Reference, Taiwanese Tax Number, Turkish tax identification number, El Salvador Tax Number, Singapore’s Unique Entity Number, Montenegro Tax Number and Kenya Tax Number.
Product
We’ve added a wizard to manage the replacement of products. Once there is no more stock of the replaced product in any of the warehouses, all the stock on all pending orders is replaced automatically.
A description can now be set for each product image.
There is now a button on the price list form to open the list of lines. This is helpful when the price list has a lot of lines.
Production
It is now possible to cancel a done production. All its stock moves are then cancelled.
The Bills of Materials now have an auto-generated internal code.
Purchase
The wizard to handle exceptions has been improved to clearly display the list of lines to recreate and the list of lines to ignore.
The menu entry Parties associated to Purchases has been removed in favour of the per party reporting.
The purchase amendment now supports amending the quantity of a purchase line using the secondary unit.
Quality
It is now no longer possible to delete non-pending inspections.
Sale
The wizards to handle exceptions have been improved to clearly display the list of lines to recreate and the list of lines to ignore.
The menu entry Parties associated to Sales has been removed in favor of the per party reporting.
A warning is now raised when the user tries to submit a complaint for the same origin as an existing complaint.
The reporting can be grouped per promotion.
From a promotion, it is now possible to list the sales related to it.
The coupon number of a promotion can now be reused once the previous promotion has expired.
The sale amendment now supports amending the quantity of a sale line using the secondary unit.
Stock
It is now possible to cancel a done shipment. When this happens the stock moves of the shipment are cancelled.
The task to reschedule late shipments now includes any shipment that is not yet done.
The supplier shipments no longer have a default planned date.
The customer shipments now have an extra state, Shipped, before the Done state.
The lot trace now shows the inventory as a document.
The package weight and the warehouse are now criteria that can be used when selecting a shipping method.
Changes for the System Administrator
The clients automatically retry 5 times on a 503 Service Unavailable response. They respect the Retry-After value if it is set in the response header. This is useful when performing short maintenance on the server without causing an interruption for the users.
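The retry behaviour can be pictured with a small, generic sketch. This is not the Tryton client code: `request_with_retry`, the `send` callable and the `default_delay` fallback are hypothetical, and the sketch assumes that "retry 5 times" means one initial attempt plus up to five retries. Only the numeric (seconds) form of Retry-After is handled.

```python
import time


def request_with_retry(send, max_retries=5, default_delay=1.0, sleep=time.sleep):
    """Call send() and retry on 503, honouring a numeric Retry-After header.

    send is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 503 or attempt == max_retries:
            return status, body
        retry_after = headers.get("Retry-After")
        try:
            delay = float(retry_after) if retry_after is not None else default_delay
        except ValueError:
            # The HTTP-date form of Retry-After is not handled in this sketch.
            delay = default_delay
        sleep(delay)


# Usage with a fake server that is unavailable for the first two calls.
responses = iter([
    (503, {"Retry-After": "2"}, b""),
    (503, {}, b""),
    (200, {}, b"ok"),
])
waits = []
result = request_with_retry(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting.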
The scheduled tasks now show when they are running and prevent the user from editing them (as they are locked anyway).
We also store their last duration for a month by default. So the administrator can analyze and find slow tasks.
It is now possible to configure a license key for the TinyMCE editor.
Also TinyMCE has been updated to version 7.
It is now possible to configure the command to use to convert a report to a different format. This allows the use of an external service like document-converter.
Accounting
The Accounting Party group has been merged into the “Accounting” group.
We now raise a warning when the user is changing one of the configured credentials used on external services. This is to prevent accidental modification.
Document Incoming
It is now possible to set a maximum size for the content of the document incoming requests.
Inbound Email
It is now possible to set a maximum size for the inbound email requests.
Web Shop
There is now a scheduled task that updates the cache that contains the product data feeds.
Changes for the Developer
Server
The ORM supports SQL Range functions and operators to build exclusion constraints. This allows, for example, the use of non-overlapping constraints using an index.
On PostgreSQL the btree_gist extension may be needed; otherwise the ORM will fall back to locking and querying the table.
The SQLite backend adds simple SQL constraints to the table schema.
The relational fields with a filter are no longer copied by default. This was a frequent source of bugs as the same relational field without the filter was already copied so it generated duplicates.
We’ve added a sparkline tool to generate textual sparklines. This allows the removal of the pygal dependency.
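To illustrate what a textual sparkline is, here is a minimal standalone sketch. It is not Tryton's implementation; the `sparkline` function and the `BLOCKS` constant below are assumptions made for the example. Each value is mapped to one of eight Unicode block characters according to its position between the minimum and the maximum.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a sequence of numbers as a one-line textual sparkline."""
    values = list(values)
    if not values:
        return ""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # A flat series is drawn with the lowest block.
        return BLOCKS[0] * len(values)
    scale = len(BLOCKS) - 1
    return "".join(
        BLOCKS[round((value - low) / span * scale)] for value in values
    )


print(sparkline([1, 5, 2, 8, 3, 9, 4]))
```

Because the output is plain text, it needs no charting library to render.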
The activate_modules from testing now accepts a list of setup methods that are run before taking the backup. This speeds up any other tests which restore the backup as they then do not need to run those setup methods.
The backend now has a method to estimate the number of rows in a table. This is faster than counting when we only need an estimate, for example when choosing between a join and a sub-query.
We’ve added a ModelSQL.__setup_indexes__ method that prepares the indexes once the Pool has been loaded.
It is now possible to generate many sequential numbers in a single call. This allows, for example, to number a group of invoices with a single call.
The backend now uses JSONB by default for MultiSelection fields. It was already supported, but the database needed to be altered to activate the feature.
You can now define the cardinality (low, normal or high) for the index usage. This allows the backend to choose an optimal type of index to create.
We now have tools that apply the typing to columns of an SQLite query. This is needed because SQLite doesn’t do a good job of supporting CAST.
The RPC responses are now compressed if their size is large enough and the client accepts it.
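The general idea of conditional compression can be sketched as follows. This is a generic illustration rather than the Tryton server code: the function `maybe_compress`, the 1024-byte threshold and the choice of gzip are all assumptions, and quality values such as `gzip;q=0` are ignored for brevity.

```python
import gzip


def maybe_compress(body, accept_encoding, min_size=1024):
    """Gzip body only if it is large enough and the client accepts gzip.

    Returns the (possibly compressed) body and the extra response headers.
    """
    accepted = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if len(body) >= min_size and "gzip" in accepted:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


payload = b"x" * 5000
compressed, headers = maybe_compress(payload, "gzip, deflate")
print(len(payload), len(compressed), headers)
```

Small responses are left alone because the compression overhead can outweigh the savings.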
The ModelView._changed_values and ModelStorage._save_values are now methods instead of properties. This makes it easier to debug errors because AttributeError exceptions are no longer hidden.
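Why a property can hide an AttributeError is easiest to see on a toy class. The sketch below is not Tryton code; `Record`, `changed_as_property` and `changed_as_method` are hypothetical names. It assumes a class that defines `__getattr__`: when a property getter raises AttributeError, Python falls back to `__getattr__` with the property's own name, so the real cause is lost.

```python
class Record:
    """Hypothetical stand-in for a model class that defines __getattr__."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        raise AttributeError(f"Record has no attribute {name!r}")

    @property
    def changed_as_property(self):
        return self.missing_field  # the real bug

    def changed_as_method(self):
        return self.missing_field  # the same bug


def error_message(func):
    """Return the message of the AttributeError raised by func()."""
    try:
        func()
    except AttributeError as exc:
        return str(exc)
    return None


record = Record()
property_error = error_message(lambda: record.changed_as_property)
method_error = error_message(lambda: record.changed_as_method())
print(property_error)  # names 'changed_as_property', hiding the real cause
print(method_error)    # names 'missing_field', the actual problem
```

With the method, the lookup of `changed_as_method` itself succeeds, so the error raised inside the call reaches the caller unchanged.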
The scheduled task runner now uses a pool of processes for better parallelism and management. Only the running task is now locked.
We’ve added an environment variable TEST_NETWORK so we can avoid running tests that require network access.
There is now a command line option for exporting translations and storing them as a po file in the corresponding module.
Tryton sets the python-format flag in the po file for the translations containing python formats. This allows Weblate (our translation service) to check if the translations keep the right placeholders.
The payment amounts are now cached on the account move line to improve the performance when searching for lines to pay.
The payment amounts now have to be greater or equal to zero.
Only purchase lines of type line can be used as an origin for a stock move.
Sale
Only sales lines of type line can be used as an origin for a stock move.
The fields from the Sale Shipment Cost Module are now all prefixed with sale_.
Stock
Cancelled moves are no longer included in the shipment and package measurements.
Django Weblog: Django bugfix release issued: 5.1.3
Today we've issued the 5.1.3 bugfix release.
The release package and checksums are available from our downloads page, as well as from the Python Package Index. The PGP key ID used for this release is Mariusz Felisiak: 2EF56372BA48CD1B.
KDE Plasma 6.2.3, Bugfix Release for November
Tuesday, 5 November 2024. Today KDE releases a bugfix update to KDE Plasma 6, versioned 6.2.3.
Plasma 6.2 was released in October 2024 with many feature refinements and new modules to complete the desktop experience.
This release adds two weeks' worth of new translations and fixes from KDE's contributors. The bugfixes are typically small but important and include:
- Bluedevil: Correct PIN entry behavior. Commit.
- KWin: Backends/drm: don't set backlight brightness to 1 in HDR mode. Commit. Fixes bug #495242
- KDE GTK Config: Gracefully handle decoration plugin failing to load. Commit.
Talking Drupal: Talking Drupal #474 - Revolt Event Loop
Today we are talking about the Revolt event loop, what it is, and why it matters with guest Alexander Varwijk (farvag). We’ll also cover IEF Complex Widget Dialog as our module of the week.
For show notes visit: https://www.talkingDrupal.com/474
Topics
- What is an event loop
- Why does Drupal need an event loop
- What will change in core to implement this
- What problem does this solve
- Does this make Cron cleaner and long running processes faster
- What impact will this have on contrib
- How would contrib use this loop
- What does this mean for database compatibility
- What inspired this change
- Test instability
- Why Revolt
- Will this help with Drupal AI
- Adopt the Revolt event loop for async task orchestration
- revoltphp/event-loop was added as a dependency to Drupal Core
- Add "EventLoop::run" to Drupal Core
- Migrate BigPipe and the Renderer code that's currently built with fibers
- Revolt Playground that shows converting some Fiber implementations from Drupal to the Event Loop
- DrupalCon Barcelona Talk about "Why Async Drupal a Big Deal Is"
- Async PHP libraries
Alexander Varwijk - alexandervarwijk.com Kingdutch
Hosts
Nic Laflin - nLighteneddevelopment.com nicxvan
John Picozzi - epam.com johnpicozzi
Joshua "Josh" Mitchell - joshuami.com joshuami
MOTW Correspondent
Martin Anderson-Clutz - mandclu.com mandclu
- Brief description:
- Have you ever wanted to use Inline Entity Forms but have the dependent form open in a dialog? There’s a module for that.
- Module name/project name:
- Brief history
- How old: created in Mar 2020 by dataweb, though recent releases are by Chris Lai (chrisck), a fellow Canadian
- Versions available: 2.1.1 and 2.2.2, the latter of which is compatible with Drupal 8.8 or newer, all the way up to Drupal 11
- Maintainership
- Actively maintained, latest release in the past month
- Number of open issues: 4 open issues, none of which are bugs against the current version
- Usage stats:
- 273 sites
- Module features and usage
- When you install the module, your Inline Entity Form widget configuration will have a new checkbox, to “Enable Popup for IEF”
- Includes specialized handling for different kinds of entities, like nodes, users, and taxonomy terms
- Will handle not just the creation forms, but editing entities, and also duplicating or deleting entities
- Not something you would always need, but can be very useful if the entity forms you want to embed, or even the parent forms, are complex
- I should also add that IEF supports form modes, so often I’ll create an “embedded” form mode that exposes fewer elements, for example hiding the fields for URL alias, sticky, and so on. So I would start there, but if the content creation experience still feels complex, then IEF Complex Widget Dialog might be a nice way to help
Ravi Dwivedi: Asante Kenya for a Good Time
In September of this year, I visited Kenya to attend the State of the Map conference. I spent six nights in Nairobi, two nights in Mombasa, and one night on a train. I was very happy with the visa process being smooth and quick. Furthermore, I stayed at the Nairobi Transit Hotel with other attendees, with Ibtehal from Bangladesh as my roommate. One of the memorable moments was the time I spent at a local coffee shop nearby. We used to go there at midnight, despite the grating in the shops suggesting such adventures were unsafe. Fortunately, nothing bad happened, and we were rewarded with a fun time with the locals.
The coffee shop Ibtehal and I used to visit at midnight
Grating at a chemist shop in Mombasa, Kenya
The country lies on the equator, which might give the impression of extremely hot temperatures. However, Nairobi was on the cooler side (10–25 degrees Celsius), and I found myself needing a hoodie, which I bought the next day. It also served as a nice souvenir, as it had an outline of the African map printed on it.
I bought a Safaricom SIM card for 100 shillings and recharged it with 1000 shillings for 8 GB internet with 5G speeds and 400 minutes talk time.
A visit to Nairobi’s Historic Cricket Ground
On this trip, I got a unique souvenir that can’t be purchased from the market: a cricket jersey worn in an ODI match by a player. The story goes as follows: I was roaming around the market with my friend Benson from Nairobi to buy a Kenyan cricket jersey for myself, but we couldn’t find any. So, Benson had the idea of visiting the Nairobi Gymkhana Club, which used to be Kenya’s main cricket ground. It has hosted some historic matches, including the 2003 World Cup match in which Kenya beat the mighty Sri Lankans and the record for the fastest ODI century by Shahid Afridi in just 37 balls in 1996.
Although entry to the club was exclusively for members, I was warmly welcomed by the staff. Upon reaching the cricket ground, I met some Indian players who played in Kenyan leagues, as well as Lucas Oluoch and Dominic Wesonga, who have represented Kenya in ODIs. When I expressed interest in getting a jersey, Dominic agreed to send me pictures of his jersey. I liked his jersey and collected it from him. I gave him 2000 shillings, an amount suggested by those Indian players.
Me with players at the Nairobi Gymkhana Club
Cricket pitch at the Nairobi Gymkhana Club
A view of the cricket ground inside the Nairobi Gymkhana Club
Scoreboard at the Nairobi Gymkhana cricket ground
Giraffe Center in Nairobi
Kenya is known for its safaris and has no shortage of national parks. In fact, Nairobi is the only capital in the world with a national park. I decided not to visit a national park, as most of them were expensive and offered multi-day tours, and I didn’t want to spend that much time in the wildlife.
Instead, I went to the Giraffe Center in Nairobi with Pragya and Rabina. The ticket cost 1500 Kenyan shillings (1000 Indian rupees). In Kenya, matatus (shared vans, usually decorated with portraits of famous people and playing rap songs) are the most popular means of public transport. Reaching the Giraffe Center from our hotel required taking five matatus, which cost a total of 150 shillings, and a 2 km walk. The journey back was 90 shillings, suggesting that we didn’t find the most efficient route to get there. At the Giraffe Center, we fed giraffes and took photos.
A matatu with a Notorious BIG portrait.
Inside the Giraffe Center
Train ride from Nairobi to Mombasa
I took a train from Nairobi to Mombasa. The train is known as the “SGR Train,” where “SGR” refers to “Standard Gauge Railway.” The journey was around 500 km. M-Pesa was the only way to make payment for pre-booking the train ticket, and I didn’t have an M-Pesa account. Pragya’s friend Mary helped facilitate the payment. I booked a second-class ticket, which cost 1500 shillings (1000 Indian rupees).
The train was scheduled to depart from Nairobi at 08:00 hours in the morning and arrive in Mombasa at 14:00 hours. The security check at the station required scanning our bags and having them sniffed by sniffer dogs. I also fell victim to a scam by a security official who offered to help me get my ticket printed, only to later ask me to get him some coffee, which I politely declined.
Before boarding the train, I was treated to some stunning views at the Nairobi Terminus station. It was a seating train, but I wished it were a sleeper train, as I was sleep-deprived. The train was neat and clean, with good toilets. The train reached Mombasa on time at around 14:00 hours.
SGR train at Nairobi Terminus.
Interior of the SGR train
Arrival in Mombasa
Mombasa Terminus station.
Mombasa was a bit hotter than Nairobi, with temperatures reaching around 30 degrees Celsius. However, that’s not too hot for me, as I am used to higher temperatures in India. I had booked a hostel in the Old Town and tried to hitchhike from the Mombasa Terminus station. After trying for more than half an hour, I took a matatu that dropped me 3 km from my hostel for 200 shillings (140 Indian rupees). I tried to hitchhike again but couldn’t find a ride.
I think I know why I couldn’t get a ride in both cases. In the first case, the Mombasa Terminus was in an isolated place, so most of the vehicles were taxis or matatus while any noncommercial cars were there to pick up friends and family. If the station were in the middle of the city, there would be many more car/truck drivers passing by, thus increasing my chances of getting a ride. In the second case, my hostel was at the end of the city, and nobody was going towards that side. In fact, many drivers told me they would love to give me a ride, but they were going in some other direction.
Finally, I took a tuktuk for 70 shillings to reach my hostel, Tulia Backpackers. It was 11 USD (1400 shillings) for one night. The balcony gave a nice view of the Indian Ocean. The rooms had fans, but there was no air conditioning. Each bed also had mosquito nets. The place was within walking distance of the famous Fort Jesus. Mombasa has had more Islamic influence compared to Nairobi and also has many Hindu temples.
The balcony at Tulia Backpackers Hostel had a nice view of the ocean.
A room inside the hostel with fans and mosquito nets on the beds
Visiting White Sandy Beaches and Getting a Hitchhike
Visiting Nyali beach marked my first time ever at a white sand beach. It was about 10 km from the hostel. The next day, I visited Diani Beach, which was 30 km from the hostel. Going to Diani Beach required crossing a river, for which there’s a free ferry service every few minutes, followed by taking a matatu to Ukunda and then a tuk-tuk to Diani Beach. This gave me an opportunity to see the beautiful countryside during the ride.
Nyali beach is a white sand beach
This is the ferry service for crossing the river.
During my return from Diani Beach to the hostel, I was successful in hitchhiking. However, it was only a 4 km ride and not sufficient to reach Ukunda, so I tried to get another ride. When a truck stopped for me, I asked for a ride to Ukunda. Later, I learned that they were going in the same direction as me, so I got off within walking distance from my hostel. The ride was around 30 km. I also learned the difference between a truck ride and a matatu or car ride. For instance, matatus and cars are much faster and cooler due to air conditioning, while trucks tend to be warmer because they lack it. Further, the truck was stopped at many checkpoints by the police for inspections as it carried goods, which is not the case with matatus. Anyway, it was a nice experience, and I am grateful for the ride. I had a nice conversation with the truck drivers about Indian movies and my experiences in Kenya.
Diani beach is a popular beach in Kenya. It is a white sand beach.
Selfie with truck drivers who gave me the free ride
Back to Nairobi
I took the SGR train from Mombasa back to Nairobi. This time I took the night train, which departs at 22:00 hours, reaching Nairobi at around 04:00 in the morning. I could not sleep comfortably since the train only had seats and no sleeper berths.
I had booked the Zarita Hotel in Nairobi and had already confirmed whether they allowed early morning check-in. Usually, hotels have a fixed checkout time, say 11:00 in the morning, and you are not allowed to stay beyond that regardless of the time you checked in. But this hotel checked me in for 24 hours. Here, I paid in US dollars, and the cost was 12 USD.
Almost Got Stuck in Kenya
Two days before my scheduled flight from Nairobi back to India, I heard the news that the airports in Kenya were closed due to the strikes. Rabina and Pragya had their flight back to Nepal canceled that day, which left them stuck in Nairobi for two additional days. I called Sahil in India and found out during the conversation that the strike was called off in the evening. It was a big relief for me, and I was fortunate to be able to fly back to India without any changes to my plans.
Newspapers at a stand in Kenya covering news on the airport closure
Experience with locals
I had no problems communicating with Kenyans, as everyone I met knew English to an extent that could easily surpass that of big cities in India. Additionally, I learned a few words from Kenya’s most popular local language, Swahili, such as “Asante,” meaning “thank you,” “Jambo” for “hello,” and “Karibu” for “welcome.” Knowing a few words in the local language went a long way.
I am not sure what’s up with haggling in Kenya. It wasn’t easy to bring the price of souvenirs down. I bought a fridge magnet for 200 shillings, which was the quoted price. On the other hand, it was much easier to bargain with taxis/tuktuks/motorbikes.
I stayed at three hotels/hostels in Kenya. None of them had air conditioners. Two of the places were in Nairobi, and they didn’t even have fans in the rooms, while the one in Mombasa had only fans. All of them had good Wi-Fi, except Tulia where the internet overall was a bit shaky.
My experience with the hotel staff was great. For instance, we requested that the Nairobi Transit Hotel cancel the included breakfast in order to reduce the room costs, but later realized that it was not a good idea. The hotel allowed us to revert and even offered one of our missing breakfasts during dinner.
The staff at Tulia Backpackers in Mombasa facilitated the ticket payment for my train from Mombasa to Nairobi. One of the staff members also gave me a lift to the place where I could catch a matatu to Nyali Beach. They even added an extra tea bag to my tea when I requested it to be stronger.
Food
At the Nairobi Transit Hotel, a Spanish omelette with tea was served for breakfast. I noticed that Spanish omelette appeared on the menus of many restaurants, suggesting that it is popular in Kenya. This was my first time having this dish. The milk tea in Kenya, referred to by locals as “white tea,” is lighter than Indian tea (they don’t put a lot of tea leaves).
Spanish Omelette served for breakfast at Nairobi Transit Hotel
I also sampled ugali with eggs. In Mombasa, I visited an Indian restaurant called New Chetna and had a buffet thali there twice.
Ugali with eggs.
Tips for Exchanging Money
In Kenya, I exchanged my money at forex shops a couple of times. I received good exchange rates for bills of 50 USD or larger. For instance, 1 USD on xe.com was 129 shillings, and I got 128.3 shillings per USD (a total of 12,830 shillings) for two 50 USD notes at an exchange in Nairobi, compared to 127 shillings, which was the highest rate at the banks. On the other hand, for each 1 USD note, I would have received an exchange rate of 125 shillings. A passport was the only document required for the exchange, and they also provided a receipt.
A good piece of advice for travelers is to keep 50 USD or larger bills for exchanging into the local currency while saving the smaller US dollar bills for accommodation, as many hotels and hostels accept payment in US dollars.
Missed Malindi and Lamu
There were more places on my to-visit list in Kenya. But I simply didn’t have time to cover them, as I don’t like rushing through places, especially in a foreign country where there is a chance of me underestimating the amount of time it takes during transit. I would have liked to visit at least one of Kilifi, Watamu or Malindi beaches. Further, Lamu seemed like a unique place to visit as it has no cars or motorized transport; the only options for transport are boats and donkeys.
Python Engineering at Microsoft: Announcing GitHub Copilot in Data Wrangler
AI did not write this blog post, but it will make your exploratory data analysis with Data Wrangler better!
Today, we’re excited to introduce our first step of integrating the power of Copilot into Data Wrangler.
With this first integration of Copilot with Data Wrangler, you’ll be able to:
- Use natural language to clean and transform your data
- Get help with fixing errors in your data transformation code
An example of using Copilot in Data Wrangler to filter for listings that allow dogs/cats
A common limitation of using AI tools for exploratory data analysis tasks today is the lack of data context provided to the AI. Responses are typically more generalized and not tailored to the specific task or data at hand. In addition, there’s always the manual and tedious task of verifying the correctness of the generated code.
What makes Copilot with Data Wrangler different is twofold. First, this integration allows you to choose to provide Copilot with your data context, enabling it to generate more relevant and specific code for the exact dataset you have open. Second, you get to preview the exact behavior of the code on your dataset with the Data Wrangler interface to visually validate Copilot’s response, along with all the benefits that the Data Wrangler tool provides.
Data transformations
With Copilot in Data Wrangler, you can ask it to perform ambiguous, open-ended transformations or a specific task you have in mind. Below we’ve included three examples of the many possibilities you can achieve with Copilot in Data Wrangler:
Formatting a datetime column
Removing any column(s) with over 40% missing values
Fixing an error in a data transformation
Getting started today
To use Copilot with Data Wrangler, you will need the following 3 prerequisites.
- You must have the Data Wrangler extension for VS Code installed.
- You must have the GitHub Copilot extension for VS Code installed.
- You must have an active subscription for GitHub Copilot in your personal account, or you need to be assigned a seat by your organization. Sign up for a GitHub Copilot free trial in your personal account.
Follow these steps to Set up GitHub Copilot in VS Code.
Once the prerequisites are met, you will see the Copilot interface within Data Wrangler by default (customizable in the Data Wrangler settings) when you are in Editing Mode. You can then either select the input box or use the default Copilot keyboard shortcut of CMD/CTRL + I.
Responsible AI
AI is not perfect (neither are we!) and it will improve over time. Microsoft and GitHub Copilot follow Responsible AI principles and employ controls to ensure that your experience with the service is appropriate, pleasant, and useful. We understand there is hesitation and concern surrounding the rapid expansion of AI’s capabilities, and fully respect those who don’t want or can’t use Copilot.
If you have any feedback around the Copilot experience in Data Wrangler, please file an issue in our Data Wrangler public GitHub repository here.
Next Steps
We are just getting started. This is the first experience in Data Wrangler that we are enhancing with Copilot. Stay tuned for more AI-powered experiences in Data Wrangler to help with your data analysis needs soon!
The post Announcing GitHub Copilot in Data Wrangler appeared first on Python.
Tag1 Consulting: Migrating your Data from D7 to D10: Configuring text formats, editors and user roles
In the previous article, we learned to apply Drupal recipes to add configuration to our Drupal 10 site. In this article, we will continue this process to bring in more configuration related to text formats and editors, user roles, and user fields.
mauricio Mon, 11/04/2024 - 06:00
Real Python: Variables in Python: Usage and Best Practices
In Python, variables are symbolic names that refer to objects or values stored in your computer’s memory. They allow you to assign descriptive names to data, making it easier to manipulate and reuse values throughout your code.
Understanding variables is key for Python developers because variables are essential building blocks for any Python program. Proper use of variables allows you to write clear, readable, and maintainable code.
In this tutorial, you’ll learn how to:
- Create and assign values to variables
- Change a variable’s data type dynamically
- Use variables to create expressions, counters, accumulators, and Boolean flags
- Follow best practices for naming variables
- Create, access, and use variables in their scopes
To get the most out of this tutorial, you should be familiar with Python’s basic data types and have a general understanding of programming concepts like loops and functions.
Don’t worry if you don’t have all this knowledge yet and you’re just getting started. You won’t need this knowledge to benefit from working through the early sections of this tutorial.
Getting to Know Variables in Python
In Python, variables are names associated with concrete objects or values stored in your computer’s memory. By associating a variable with a value, you can refer to the value using a descriptive name and reuse it as many times as needed in your code.
Variables behave as if they were the value they refer to. To use variables in your code, you first need to learn how to create them, which is pretty straightforward in Python.
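As a small illustration of a variable standing in for the value it refers to, consider the following sketch. The names `price`, `quantity`, `fruits`, and `basket` are made up for this example and are not taken from the tutorial.

```python
# A name can be used anywhere the value it refers to could be used.
price = 20
quantity = 3
total = price * quantity  # equivalent to 20 * 3

# Two names can refer to the same object; `is` checks identity.
fruits = ["apple", "mango"]
basket = fruits
basket.append("grape")  # the change is visible through both names

print(total)
print(fruits)
print(basket is fruits)
```

Because `basket` and `fruits` are two names for one list object, appending through one name changes what you see through the other.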
Creating Variables With Assignments
The primary way to create a variable in Python is to assign it a value using the assignment operator and the following syntax:
    variable_name = value

In this syntax, you have the variable’s name on the left, then the assignment (=) operator, followed by the value you want to assign to the variable at hand. The value in this construct can be any Python object, including strings, numbers, lists, dictionaries, or even custom objects.
Note: To learn more about assignments, check out Python’s Assignment Operator: Write Robust Assignments.
Here are a few examples of variables:
    >>> word = "Python"
    >>> number = 42
    >>> coefficient = 2.87
    >>> fruits = ["apple", "mango", "grape"]
    >>> ordinals = {1: "first", 2: "second", 3: "third"}
    >>> class SomeCustomClass: pass
    >>> instance = SomeCustomClass()

In this code, you’ve defined several variables by assigning values to names. The first five examples include variables that refer to different built-in types. The last example shows that variables can also refer to custom objects like an instance of your SomeCustomClass class.
Setting and Changing a Variable’s Data Type
Apart from a variable’s value, it’s also important to consider the data type of the value. When you think about a variable’s type, you’re considering whether the variable refers to a string, integer, floating-point number, list, tuple, dictionary, custom object, or another data type.
Python is a dynamically typed language, which means that variable types are determined and checked at runtime rather than during compilation. Because of this, you don’t need to specify a variable’s type when you’re creating the variable. Python will infer a variable’s type from the assigned object.
Note: In Python, variables themselves don’t have data types. Instead, the objects that variables reference have types.
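A minimal sketch of what the note means in practice: the same name can be rebound to objects of different types, and `type()` always reports the type of the object currently referenced. The name `value` and the helper names `first_type`, `second_type`, and `third_type` are chosen only for this illustration.

```python
value = 42
first_type = type(value).__name__   # type of the int object

value = "forty-two"                 # rebind the same name to a str
second_type = type(value).__name__

value = [4, 2]                      # and now to a list
third_type = type(value).__name__

print(first_type, second_type, third_type)
```

No declaration or conversion is involved here. Each assignment simply points the name at a new object.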
For example, consider the following variables:
    >>> name = "Jane Doe"
    >>> age = 19
    >>> subjects = ["Math", "English", "Physics", "Chemistry"]
    >>> type(name)
    <class 'str'>
    >>> type(age)
    <class 'int'>
    >>> type(subjects)
    <class 'list'>

Read the full article at https://realpython.com/python-variables/
Golems GABB: Best Practices for REST APIs in Drupal 11
Are you worried about how to make your Drupal REST APIs efficient and secure yet fulfil today's needs? As Drupal 11 looms on the horizon, both developers and Drupal website owners are looking forward to using its benefits to create robust APIs.
However, with much power comes great responsibility, and figuring out the best methods for creating REST APIs can seem very difficult. Are you prepared to use its full capability? In this article, the Golems company delves into the best practices for REST APIs in Drupal 11.
Sven Hoexter: Google CloudDNS HTTPS Records with ipv6hint
I naively provisioned an HTTPS record at Google CloudDNS like this via terraform:
    resource "google_dns_record_set" "testv6" {
      name         = "testv6.some-domain.example."
      managed_zone = "some-domain-example"
      type         = "HTTPS"
      ttl          = 3600
      rrdatas      = ["1 . alpn=\"h2\" ipv4hint=\"198.51.100.1\" ipv6hint=\"2001:DB8::1\""]
    }

This results in a permanent diff because the Google CloudDNS API seems to parse the record content, and stores the ipv6hint expanded (removing the :: notation) and in all lowercase as 2001:db8:0:0:0:0:0:1. Thus, to fix the permanent diff, we have to use it like this:

    resource "google_dns_record_set" "testv6" {
      name         = "testv6.some-domain.example."
      managed_zone = "some-domain-example"
      type         = "HTTPS"
      ttl          = 3600
      rrdatas      = ["1 . alpn=\"h2\" ipv4hint=\"198.51.100.1\" ipv6hint=\"2001:db8:0:0:0:0:0:1\""]
    }

Guess I should be glad that they already support HTTPS records natively, and not bicker too much about the implementation details.
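If you want to compute the stored form up front rather than copy it from the plan output, the normalisation described above can be sketched in Python with the standard `ipaddress` module. The function name `clouddns_style` is hypothetical. The assumption that CloudDNS always stores eight lowercase groups without leading zeros is inferred from the single example in this post, not from any documentation.

```python
import ipaddress


def clouddns_style(addr: str) -> str:
    """Expand '::' and lowercase an IPv6 address, dropping leading zeros
    in each group (the form observed in the post, assumed to be general)."""
    # .exploded yields eight zero-padded lowercase groups,
    # e.g. 2001:0db8:0000:0000:0000:0000:0000:0001
    exploded = ipaddress.IPv6Address(addr).exploded
    # Re-render each group as a hex number without zero padding.
    return ":".join(format(int(group, 16), "x") for group in exploded.split(":"))


print(clouddns_style("2001:DB8::1"))  # 2001:db8:0:0:0:0:0:1
```

The output can then be pasted into the `ipv6hint` value so that Terraform and the API agree on the record content.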
Robin Wilson: Join the GeoTAM hackathon to work out business turnovers!
Summary: I’m involved in organising a hackathon, and I’d love you to take part. The open-source GeoTAM hackathon focuses on estimating turnover for individual business locations in the UK, from a variety of open datasets. Please check out the hackathon page and sign up. There are prizes of up to £2,000!
I’m currently working with Rebalance Earth, a boutique asset manager who are focused on making nature an investable asset. Our aim is to mobilise investment in UK natural infrastructure – for example, by arranging investment to undertake river restoration and reduce the risk of flooding. We will do this by finding businesses at risk of flooding, designing restoration schemes that will reduce this risk, and setting up ‘Nature-as-a-Service’ contracts with businesses to pay for the restoration.
I’m the Lead Geospatial Developer at Rebalance Earth, and am leading the development of our Geospatial Predictive Analytics Platform (GPAP), which helps us assess businesses at risk of flooding and design schemes to reduce this flooding.
An important part of deciding which areas to focus on is estimating the total business value at risk from flooding. A good way of establishing this is to use an estimate of the business turnover. However, there are no openly-available datasets showing business turnover in the UK – which is where the hackathon comes in.
We’re looking for participants to bring their expertise in programming, data science, machine learning and more to take some datasets we provide, combine them with other open data and try and estimate turnover. Specifically, we’re interested in turnover of individual business locations – for example, the turnover of a specific supermarket, not the whole supermarket chain.
The hackathon runs from 20th – 26th November 2024. We’ll provide some datasets, some ideas, and a Discord server to communicate through. We’d like you to bring your expertise and see what you can produce. This is a tricky task, and we’re not expecting fully polished solutions; proof-of-concept solutions are absolutely fine. You can enter as a team or an individual.
Most importantly, there are prizes:
- £2,000 for the First Prize
- £1,000 for the Second Prize
- £500 for the Third Prize
and there’s a possibility that we might even hire you to continue work on your idea!
So, please sign up and tell your friends!
C++20 comparison in Qt (even with C++17🤩)
In the Qt 6.7 release, we enabled support for C++20 comparison and also back-ported some of its features to C++17. This blog post will give you an overview of the comparison enhancements we are taking advantage of and offer guidance on implementing them in your custom classes.
Python Bytes: #408 python-preference only-managed 3.13t
Zato Blog: Meaningful automation in Python
This article is an introduction to meaningful automation, integrations and interoperability with Zato, service-oriented thinking and Python.
Zato is a convenient and secure, Python-based, open-source, service-oriented platform for automation, integrations and interoperability. It is used to connect distributed systems or data sources and to build API-focused, middleware and backend applications.
The platform is designed and built specifically with Python users in mind - often working in, and for, industries such as telecommunications, defense, health care and others that require automation, integrations and interoperability of multiple systems and processes.
Sample real-world, mission-critical Zato environments include:
- Systems for telecommunication operators integrating CRM, ERP, Charging Systems, Billing and other OSS/BSS applications internal or external to the operators, including network automation of packet brokers and other network visibility and cybersecurity tools from Keysight
- Enterprise service buses for government, helping in the digital transformation of legacy systems and processes towards modern capabilities
- AI, ML and data science systems that analyze and improve acquisition and supply chain activities in government processes
- Applied observability automation that enables meaningful decision making through the orchestration and coordination of the collection, distribution and presentation of data spread across a pool of independent systems
- Platforms for health care and public administration systems, helping to achieve data interoperability through the integration of independent data sources, databases and health information exchanges (HIE)
- Global IoT platforms for hybrid integrations of medical devices and software both in the cloud and on premises
- Cybersecurity automation, including IT/OT hardware and software assets
- Robotic process automation (RPA) of message flows and events produced by early warning systems
Zato offers connectors to all the popular technologies and vendors, such as REST, Cloud, task scheduling, Microsoft 365, Salesforce, Atlassian, SAP, Odoo, SQL, HL7, FHIR, AMQP, IBM MQ, LDAP, Redis, MongoDB, WebSockets, SOAP, Caching and many more.
Running in the cloud, on premises, or under Docker, Kubernetes and other container technologies, Zato services are optimized for high performance and security - it is easily possible to run hundreds and thousands of services on typical server instances as offered by Amazon, Google Cloud, Azure or other cloud providers.
Zato servers offer high availability and no-downtime deployment. Servers form clusters that are used to scale systems both horizontally and vertically.
The product is commercial open-source software with training, professional services and enterprise 24x7x365 support available.
A platform and language for interesting, reusable and atomic services
Zato promotes the design of, and helps you build, solutions composed of services that are interesting, reusable and atomic (IRA).
What does it really mean in practice that something is interesting, reusable and atomic? In particular, how do we define what is interesting?
Each interesting service should make its users want to keep using it more and more. People should immediately see the value of using the service in their processes. An interesting service strikes everyone as immediately useful in wider contexts, preferably with few or no conditions, prerequisites and obligations.
An interesting service is aesthetically pleasing, both in terms of its technical usage as well as its relevance to, and potential applicability in, fields broader than originally envisioned. If people check the service and say "I know, we will definitely use it" or "Why don't we use it" you know that the service is interesting. If they say "Oh no, not this one again" or "No, thanks, but no" then it is the opposite.
Note that the focus here is on the value that the service brings for the user. You constantly need to keep in mind that people generally want to use services only if they allow them to fulfill their plans or execute some bigger ideas. Perhaps they already have them in mind and they are only looking for technical means of achieving that, or perhaps it is your services that will make a person realize that something is possible at all, but the point is the same: your service should serve a grander purpose.
This mindset, of wanting to build things that are useful and interesting is not specific to Python or, indeed, to software and technology. Even if you are designing and implementing services for your own purposes, you need to act as if you were a consultant that can always see a bigger vision, a bigger architecture, and who can envision results that are still ahead in the future while at the same time not forgetting that it is always a series of small interesting actions, that everyone can relate to, that lead to success.
A curious observation, particularly when you consider all the various aspects of the digital transformation that companies and organizations go through, is that many people to whom the services are addressed, or who sponsor their development, are surprised when they see what automation and integrations are capable of.
Put differently, many people can only begin to visualize bigger designs once they see in practice smaller, practical results that further their missions, careers and otherwise help them at work. This is why, again, the focus on being interesting is essential.
At the same time, it can be advantageous to you that people will not see automation or integrations coming. That lets you take the lead and build a center of such a fundamental shift around yourself. This is a great position to be in, a blue ocean of possibilities, because it means little to no competition inside an organization that you are a part of.
If you are your own audience, that is, if you build services for your own purposes, the same principles apply and it is easy to observe that thinking in services lets you build a toolbox of reusable, complementary capabilities, a portfolio, that you can take with you as you progress in your career. For instance, your services, and your work, can concentrate on a particular vendor and, with a set of services that automate their products, you will always be able to put that to use, shortening your own development time, no matter who employs you and in what way.
Regardless of who the clients that you build the solutions for are, observe that automation and integrations with services are evolutionary and incremental in their nature, at least initially. Yes, the resulting value can often be revolutionary but you do not intend to incur any massive changes until there are clear, interesting results available. Trying to integrate and change existing systems at the same time is doable, but not trivial, and it is best left to later stages, once your automation gets the necessary buy-in from the organization.
Services should be ready to be used in different, independent processes. The processes can be strictly business ones, such as processing of orders or payments, or they can be of a deep, technical nature, e.g. automating cybersecurity hardware. What matters in either case is that reusability breeds both flexibility and stability.
There is inherent flexibility in being able to compose bigger processes out of smaller blocks with clearly defined boundaries, which can easily translate to increased competitive advantage when services are placed into more and more areas. A direct result of this is a reduction in R&D time as, over time, you are able to choose from a collection of loosely-coupled components, the services, that hide implementation details of a particular system or technology that they automate or integrate with.
Through their continued use in different processes, services can reduce overall implementation risks that are always part of any kind of software development - you will know that you can keep reusing stable functionality that has been already well tested and that is used elsewhere.
Because services are reusable, there is no need for gigantic, pure waterfall-style implementations of automation and integrations in an organization. Each individual project can contribute a smaller set of services that, as a whole, constitute the whole integrated environment. Conversely, each new project can start to reuse services delivered by the previous ones, hence allowing you to quickly, incrementally, prove the value of the investment in service-oriented thinking.
To make them reusable, services are designed in a way that hides their implementation details. Users only need to know how to invoke the service; the specific systems or processes it automates or integrates are not necessarily important for them to know as long as a specific business goal is achieved. Thanks to that, both services and what they integrate can be replaced without disrupting other parts - in reality, this is exactly what happens - systems with various kinds of data will be changed or modernized but the service will stay the same and the user will not notice anything.
Each service fulfills a single, atomic business need. Each service is deployed independently and, as a whole, they constitute an implementation of business processes taking place in your company or organization. Note that the definition of what the business need is, again, specific to your own needs. In purely market-oriented integrations, this may mean, for instance, the opening of a bank account. In IT or OT automation, on the other hand, it may mean the reconfiguration of a specific device.
That services are atomic also means that they are discrete and that their functionality is finely grained. You will recognize whether a design goes in this direction if you consider the names of the services for a moment. An atomic service will invariably use a short name consisting of a single verb and noun, for instance, "Create Customer Account", "Stop Firewall" or "Conduct Feasibility Study". It is easy to see that we cannot break them down into smaller parts; they are atomic.
At the same time, you will keep creating composite services that invoke other services; this is natural and as expected, but you will not consider services such as "Create Customer Account and Set Up a SIM Card" as atomic ones because, in that form, they will not be very reusable, and a major part of why being atomic is important is that it promotes reusability. For instance, the reason to have separate services to create customer accounts, independently of setting up their SIM cards, is that one can without difficulty foresee situations when an account is created but a SIM card is purchased at a later time and, conversely, one customer account should surely be able to have multiple SIM cards. Think of it as being similar to LEGO bricks where just a few basic shapes can form millions of interesting combinations.
The point about service naming conventions is well worth remembering because this lets you maintain a vocabulary that is common to both technical and business people. A technical person will understand that such naming is akin to the CRUD convention from the web programming world while a business person will find it easy to map the meaning to a specific business function within a broader business process.
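The distinction between atomic and composite services can be sketched in plain Python. This is only an illustration of the idea and does not use Zato's service API. The function names, the dictionary-based account, and the sample phone numbers are all hypothetical.

```python
import itertools

# Hypothetical in-memory ID generator standing in for a real backend.
_account_ids = itertools.count(1)


def create_customer_account(name):
    """Atomic service: one verb, one noun."""
    return {"account_id": next(_account_ids), "name": name, "sim_cards": []}


def set_up_sim_card(account, msisdn):
    """Atomic service: usable at any time after the account exists."""
    account["sim_cards"].append(msisdn)
    return account


def onboard_mobile_customer(name, msisdn):
    """Composite service: invokes the two atomic services in sequence."""
    account = create_customer_account(name)
    return set_up_sim_card(account, msisdn)


# The atomic services can be reused independently: an account first,
# then two SIM cards added later.
account = create_customer_account("Jane Doe")
set_up_sim_card(account, "+44 7700 900001")
set_up_sim_card(account, "+44 7700 900002")
print(len(account["sim_cards"]))

# The composite service covers the common "everything at once" case.
onboarded = onboard_mobile_customer("John Doe", "+44 7700 900003")
print(onboarded["sim_cards"])
```

Keeping account creation and SIM card setup separate is what makes both later scenarios possible without writing a new service for each combination.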
With Zato, you use Python to focus on the business logic exclusively and the platform takes care of scalability, availability, communication protocols, messaging, security or routing. This lets you concentrate only on what is the very core of systems integrations - making sure their services are interesting, reusable and atomic.
Python is the perfect choice for this job because it hits the sweet spot under several key headings:
- It is a very high level language, with a syntax close to how the grammar of various spoken languages works, which makes it easy to translate business requirements into implementation
- It is a solid, mainstream and full-featured, real programming language rather than a domain-specific one, which means that it offers a great degree of flexibility and choice in expressing your needs
- It is difficult to find universities without Python courses. Most people entering the workforce already know Python; it is a new career language. In fact, it is becoming more and more difficult to find new talent who would not prefer to use Python.
- Yet, one does not need to be a developer or a full-time programmer to use Python. In fact, most people who use Python are not programmers at all. They are specialists in other fields who also need to use a programming language to automate or integrate their work in a meaningful way.
- Many Python users come from backgrounds in network and cybersecurity engineering - fields that naturally require a lot of automation using a real language that is convenient and easy to get started with
- Many Python users are scientists with a background in AI, ML and data science, applying their domain-specific knowledge in processes that, by their very nature, require them to collect and integrate data from independent sources, which again leads to automation and integrations
- Many Python users have a strong web programming background which means that it takes little effort to take a step further, towards automation and integrations. In turn, this means that it is easy to find good people for API projects.
- Many Python users know multiple programming languages - this is very useful in the context of integration projects where one is typically faced with dozens of technologies, vendors or integration methods and techniques.
- Lower maintenance costs - thanks to the language's unique design, Python programmers tend to produce code that is easy to read and understand. From the perspective of multi-year maintenance, reading and analyzing code, rather than writing it, is what most programmers do most of the time, so it makes sense to use a language that makes it easy to carry out the most common tasks.
In short, Python can be construed as executable pseudo-code with many of its users already having roots in modern server-side programming so Zato, both from a technical and strategic perspective, is a natural choice for both simple and complex, sophisticated automation, integration and interoperability solutions as a platform built in the language and designed for Python people from day one.
Next steps:
➤ Read about how to use Python to build and integrate enterprise APIs that your tests will cover
➤ Python API integration tutorial
➤ Python Integration platform as a Service (iPaaS)
➤ What is an Enterprise Service Bus (ESB)? What is SOA?
James Bennett: Three Django wishes
Michael Foord: Python Knowledge Sharing Videos Online
I’ve been teaching Python in one hour knowledge sharing sessions, some of which I’ve put online on YouTube.
This is the link to the playlist of the sessions:
The slides for each of the sessions, along with some example code, can be found in this GitHub repository:
So far there are seven one-hour sessions (with more planned) on:
- Python Core Object Model
  - Python objects
  - Slots
  - Attribute lookup and the MRO
  - Inheritance, multiple inheritance and super
  - Inside Python objects and classes
- Closures and decorators (functional programming)
  - Functional programming: higher order functions and functions as objects
  - Lambdas
  - Closures: functions that build functions
  - Variable scoping: global, local and nonlocal
  - Decorators: functions wrapping functions
  - Decorator factories (decorators that take arguments)
  - Class decorators
  - Decorator order and using functools.wraps
- Generators and Iterators
  - The iteration protocol
  - Stateful iteration with generators
  - Adding iteration support to objects
- References, assignment and mutability
  - Identity versus equality
  - Call by object
  - Object copying
- Unicode, Floats and regex
  - Floating point numbers
  - Unicode, encodings and strings
  - Regular expressions
- Concurrency (async, threads, processes, the GIL)
  - The history of concurrency from AmigaOS to a multi-core world
  - Python and the Global Interpreter Lock
  - I/O bound and CPU bound tasks
  - Threads and processes
  - Async programming (green threading, coroutines)
  - Concurrency with threads
  - Concurrency with multiprocessing
  - Looking to the future (Python 3.13): optional GIL (PEP 703) and subinterpreters (PEP 554)
- Testing with pytest
  - virtual environments and pipenv (installing pytest)
  - pytest command line for collecting and running tests
  - Simple test functions and asserts
  - Test fixtures and conftest.py
  - Testing exceptions
  - Test parameterisation for test combinations
  - Test marking for running test subsets
  - Principles of testing (unit tests versus end to end testing, building test helpers etc)
  - Mocking and patching
- Modules and Namespaces
  - Import syntax variations
  - namespaces and variable lookups
  - sys.modules and the import cache
  - Module objects
  - Module level functionality: __dir__ and __getattr__
  - Packages and the filesystem
  - Relative import syntax
  - Module reloading (how to do it and why not to do it)
  - Circular imports, avoiding and fixing
  - Executable modules and packages
A selection of some of the talks and interviews I’ve given on Python and software engineering across my career.
- UK Health Security Agency Software Development Practise Conf 2024
- PyCon UK 2023, Metaclasses in 5 Minutes Lightning Talk
- PyCon MEA 2022 How Python Took Over the World
- Test and Code Podcast Episode 145: For Those About to Mock
- PyCon Belarus 2020 How Python Took Over the World
- PyLondinium 2019 The Python Object Model
- Interview on Podcast.__init__ on testing, Mock and the Python community (2018)
- The Role of Abstractions: Lightning Talk PyCon US 2018
- Best Practises for Software Development and Testing (2017)
- PyCon UK Panel 2015
- To the Clouds: EuroPython 2015
- Automated Deployments with Juju: PyCon UK 2014
- Python and Pythons: PyCon NZ 2013
- Testing with Mock: PyCon US 2011
- A Little Bit of Python Podcast (2010-11)
- New and Improved unittest 2: PyCon US 2010
- Michael Foord on IronPython: Hanselminutes 2009
- Michael Foord on IronPython: TechEd 2007
Michael Foord: Agile Alliance Scrummaster Certification
I’ve been a fan of Agile ever since my first programming job with Resolver Systems back in 2006. I had taught myself programming and there I really learned engineering, how to build software products whilst caring about quality. I became passionate about testing as a way to ensure a minimal level of quality and about agile processes which are able to change quickly.
My only experience of Scrum was for a year at a heavily waterfall shop which layered Scrum for project management on top of heavily waterfall software development processes. It wasn’t a fun experience but I still learned a great deal.
I’ve been working with Gigaclear, as team lead on backend API servers, including One Touch Switch, for about a year now. I enjoy our software development practises and processes; we use a combination of agile for software development, devops (devsecops of course) for software deployment and maintenance, and Scrum for project management. It’s a very effective combination.
I recently attended a course with the Agile Alliance, led by John McFadyen, and became certified as a Scrummaster.
The course was fantastic and very inspiring. At Gigaclear we’re systematically evaluating all our systems, systems architecture and processes. This process and the course have been hugely inspirational and I have an article on software development processes coming shortly…
Glyph Lefkowitz: The Federation Deathmatch
It’s the weekend, and I have some Thoughts about federated social media. So, buckle up, I guess, it’s time to start some fights.
Recently there has been some discourse about Bluesky’s latest fundraising round. I’ve been participating in conversations about this on Mastodon, and I think I might sometimes come across as a Mastodon partisan, but my feelings are complex and I really don’t want to be boosting the ActivityPub Fediverse without qualification.
So here are some qualifications.
Bluesky Is Evil
To the extent that I am an ActivityPub partisan in the discourse between ActivityPub and ATProtocol, it is because I do not believe that Bluesky is a meaningfully decentralized social network. It is a social network, run by a company, which has a public API with some elements that might, one day, make it possible for it to be decentralized. But today, it is not, either practically or theoretically.
The Bluesky developers are putting in a ton of effort to maybe make it decentralized, hypothetically, someday. A lot of people think they will succeed. But ActivityPub (and, of course, Mastodon specifically) are already, today, meaningfully decentralized. As you can see on FediDB, there are instances with hundreds of thousands of people on them, before we even get to esoterica like the integrations Threads, WordPress, Flipboard, and Ghost are doing.
The inciting incident for this post — that a lot of people are also angry about Bluesky raising millions of dollars from Evil Guys Doing Evil Stuff Capital — is indeed a serious concern. It lights the fuse that burns towards their eventual, inevitable incredible journey. ATProtocol is just an API, and that API will get shut off one day, whenever their funders get bored of the pretense of their network being “decentralized”.
At time of writing, it is also interesting that 3 of the 4 times that the CEO of Bluesky has even skeeted the word “blockchain” is to say “no blockchain”, to reassure users that the scam magnet of “Blockchain” is not actually near their product or protocol, which is a much harder position to maintain when your lead investor is “Blockchain Capital”.
I think these are all valid criticisms of Bluesky. But I also think that the actual engineers working on the product are aware of these issues, and are making a significant effort to address them or mitigate them in any way they can. All that work can still be easily incinerated by a slow quarter in terms of user growth numbers or a missed revenue forecast when the VCs are getting impatient, but it’s not nothing, it is a life’s work.
Really, who among us could not have our life’s ambitions trivially destroyed in an afternoon, simply because a billionaire decided that they should be? If you feel like you are safe from this, I have some bad news about how money works. So we are all doing our best in an imperfect system and maybe Bluesky is on to something here. That’s eminently possible. They’re certainly putting forth an earnest effort.
Mastodon Is Stupid
Meanwhile, not nearly as much has been made recently of Mastodon refusing funding from a variety of sources, when all indications are that funding is low, and plummeting, far below the level required to actually sustain the site, and they haven’t done a financial transparency report for over a year, and that report was already nearly a year late.
Mastodon and the fediverse are not nearly in a position to claim moral superiority over Bluesky. Sure, taking blockchain VC money might seem like a rookie mistake, but going out of business because you are spurning every possible source of funding is not that wise either.
Some might think that, sure, Mastodon the company might die but at least the Fediverse as a whole will keep going strong, right? Lots of people run their own instances! I even find elements of this argument convincing, and I think there is probably some truth to it. But to really believe this argument as claimed, that it’s a fait accompli that the fediverse will survive in some form, that all those self-run servers will be a robust network that will self-repair, requires believing some obviously false stuff. It is frankly unprofitable to run a Fediverse instance. Realistically, if you want to operate a Mastodon server for yourself, it is going to cost at least $100/year once you include stuff like having a domain name, and managing the infrastructure costs is a complex problem that keeps getting harder to manage as the software itself gets slower.
Cory Doctorow has recently argued that this is all worth it, because at least on Mastodon, you’re in control, not at the whims of centralized website operators like Bluesky. In his words,
On Mastodon (and other services based on Activitypub), you can easily leave one server and go to another, and everyone you follow and everyone who follows you will move over to the new server. If the person who runs your server turns out to be imperfect in a way that you can’t endure, you can find another server, spend five minutes moving your account over, and you’re back up and running on the new server
He concludes:
Any system where users can leave without pain is a system whose owners have high switching costs and whose users have none
(Emphasis mine).
This is a beautiful vision. It is, however, an incorrect assessment of the state of the Fediverse as it stands today. It’s not true in two important ways:
First, if you look at any account of a user’s fediverse account migration, like this one from Steve Bate or this one from the Ente project or this one from Erin Kissane, you will see that it is “painful for the foreseeable future” or “wasn’t as seamless as advertised”, and that “the best time to […] migrate instances […] is never”. This language does not presage a pleasant experience, as Doctorow puts it, “without pain”.
Second, migration is an active process that requires engagement from the instance that hosts you. If you have been blocked or banned, or had your account terminated, you are just out of luck. You do not have control over your data or agency over your online identity unless you’ve shelled out the relatively exorbitant amount of money to actually operate your own instance.
In short, ActivityPub is no panacea. A federated system is not really a “decentralized” system, as much as it is a bunch of smaller centralized systems that all talk to each other. You still need to know, and care, about your social and financial relationship to the operators of your instance. There is probably no getting away from this, like, just generally on the Internet, no matter how much peer-to-peer software we deploy, but there certainly isn’t in the incomplete mess that is ActivityPub.
JOIN, or DIE.
Neither Mastodon (or ActivityPub) nor Bluesky (or ATProtocol) has a comprehensive solution to the problem of decentralized social media. These companies, and these protocols, are both deeply flawed and if everything keeps bumping along as it is, I believe both are likely to fail. At different times, on different timelines, and for different reasons, but fail nonetheless.
However, these networks are both small and growing, and we are not yet in the phase of enshittification where margins are shrinking and audiences are captured and the screws must be tightened to juice revenue. There are still possibilities. Mastodon is crowdfunded and what they lack in resources they make up for in flexibility and scrappiness. Bluesky has money and while there will eventually be a need to monetize somehow, they have plenty of runway to come up with that answer, and a lot of sophisticated protocol work has been done. Not enough to make a complete circuit and allow users true, practical decentralization, but it’s not nothing, either.
Mastodon and Bluesky are both organizations with humans in them, and piles of data that is roughly schema-compatible even if the nuances and details are different. I know that there is a compatible model because, thanks to both platforms being relatively open, there is a functioning ActivityPub/ATProtocol bridge in the form of Brid.gy Fed. You can use it today, and I highly recommend that you do so, so that “choice of protocol” does not fully define your audience. If you’re on Bluesky, follow this account, and if you’re on Mastodon or elsewhere on the Fediverse, search for and follow @bsky.brid.gy@bsky.brid.gy.
The reality that fans of decentralized, independent social media must confront is that we are a tiny audience right now. Whichever site we are looking at, we are talking about a few million monthly active users at best, in a world where even the pathetic husk of Twitter still has hundreds of millions and Facebook has billions. Internecine fights are not going to get us anywhere. We need to build bridges and links and connect our networks as densely as possible. If I’m being honest, Bridgy Fed looks like a pretty janky solution, but it’s something, and we need to start doing something soon, so we do not collectively become a permanent minority that mass markets can safely ignore.
As users, we need to set an example, so that the developers of the respective platforms get their shit together and work together directly so that workarounds like Bridgy are not required. Frankly, this is mostly on the ActivityPub and Mastodon devs, as far as I can tell. Unfortunately, not a lot of this seems to be public, or at least I haven’t witnessed a lot of it directly, but I have heard repeatedly that the ActivityPub developers are prickly, and this is one high-profile public example where an ActivityPub partisan is incredibly, pointlessly hostile and borderline harassing towards someone — Mike Masnick, a long-time staunch advocate for open protocols and open patents, someone with a Mastodon account, and thus as good a prospective ally as the ActivityPub fediverse might reasonably find — explaining some of the relative benefits of Bluesky.
Most of us are technology nerds in one way or another. In that way we can look at signifiers like “ActivityPub” and “ATProtocol”, and feel like these are hard boundaries around different all-encompassing structures for the future, and thus tribes we must join and support.
A better way to look at this, however, is to see social entities like Mastodon gGmbH and Bluesky PBC — or, more to the point, Fosstodon, SFBA Social, Hachyderm (and maybe, one day, even an instance which isn’t fully just for software development nerds), as groups that deploy these protocols to access some data that they publish, just as they might publish their website over HTTP or their newsletters over SMTP. There are technical challenges involved in bridging between mutually unintelligible domain models, but that is, like, network software's whole deal. Most software is just some kind of translation from one format or context to another. The best possible future for the fediverse is the one where users care as much about the distinction between ATProtocol and ActivityPub as they do about the distinction between POP3 and IMAP.
To both developers and users of these systems, I say: get it together. Be nice to each other. Because the rest of the social media ecosystem is sure as shit not going to be nice to us if we ever see even a hint of success and start to actually cut into their user base.
Acknowledgments
Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor!