Feeds

Montreal Python User Group: Montréal-Python 79 – Quadratic Judo

Planet Python - Sun, 2020-08-09 00:00

Summer nears its end and Montréal-Python is coming back from vacation. Before going back to school or to work, come and tell us what new Pythonic things you did during the summer. As always, we are looking for passionate people who want to share with the community. Be it about a project that you have built or a library that you learned, send us your talk proposal at mtlpyteam@googlegroups.com. Talks can be anywhere from 5 to 20 minutes long. We accept all reasonable proposals until the program is complete. We can't wait to read your proposal!

Montréal-Python 79, Quadratic Judo, will be live-streamed on our YouTube channel on August 24 at 5:30 PM EST.

As usual, we want to bring your attention to the Montréal-Python code of conduct. We remind you that we are an open community and actively invite people of all minorities to come and share their experiences and their knowledge. To align with our values, we also remember those wise words by the Python Software Foundation:

In silence we are complicit. To eradicate racism, xenophobia, and all other forms of inequality, we must take action. It is our responsibility to use our platforms to amplify the voices of the oppressed and marginalized.

We ask today – and always – that the Python community comes together in support of the Black community, and makes Black voices heard.

Black lives matter.

And finally, PyBay 2020 moved to a fully online conference taking place next weekend, August 15 and 16. They offer free tickets to women and to members of many minorities. Check out their website for all the details:

https://pybay.com/

Categories: FLOSS Project Planets

Full Stack Python: How to Transcribe Speech Recordings into Text with Python

Planet Python - Sun, 2020-08-09 00:00

When you have a recording where one or more people are talking, it's useful to have a highly accurate and automated way to extract the spoken words into text. Once you have the text, you can use it for further analysis or as an accessibility feature.

In this tutorial, we'll use a high accuracy speech-to-text web application programming interface called AssemblyAI to extract text from an MP3 recording (many other formats are supported as well).

With the code from this tutorial, you will be able to take an audio file that contains speech such as this example one I recorded and output a highly accurate text transcription like this:

An object relational mapper is a code library that automates the transfer of data stored in relational, databases into objects that are more commonly used in application code or EMS are useful because they provide a high level abstraction upon a relational database that allows developers to write Python code instead of sequel to create read update and delete, data and schemas in their database. Developers can use the programming language. They are comfortable with to work with a database instead of writing SQL... (the text goes on from here but I abbreviated it at this point)

Tutorial requirements

Throughout this tutorial we are going to use the following dependencies, which we will install in just a moment. Make sure you also have Python 3, preferably 3.6 or newer, installed in your environment:

We will use the following dependencies to complete this tutorial:

All code in this blog post is available open source under the MIT license on GitHub under the transcribe-speech-text-script directory of the blog-code-examples repository. Use the source code as you desire for your own projects.

Setting up the development environment

Change into the directory where you keep your Python virtual environments. I keep mine in a subdirectory named venvs within my user's home directory. Create a new virtualenv for this project using the following command.

python3 -m venv ~/venvs/pytranscribe

Activate the virtualenv with the activate shell script:

source ~/venvs/pytranscribe/bin/activate

After the above command is executed, the command prompt will change so that the name of the virtualenv is prepended to the original command prompt format. For example, if your prompt is simply $, it will now look like the following:

(pytranscribe) $

Remember, you have to activate your virtualenv in every new terminal window where you want to use dependencies in the virtualenv.

We can now install the requests package into the activated but otherwise empty virtualenv.

pip install requests==2.24.0

Look for output similar to the following to confirm the appropriate packages were installed correctly from PyPI.

(pytranscribe) $ pip install requests==2.24.0
Collecting requests==2.24.0
  Using cached https://files.pythonhosted.org/packages/45/1e/0c169c6a5381e241ba7404532c16a21d86ab872c9bed8bdcd4c423954103/requests-2.24.0-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests==2.24.0)
  Using cached https://files.pythonhosted.org/packages/5e/c4/6c4fe722df5343c33226f0b4e0bb042e4dc13483228b4718baf286f86d87/certifi-2020.6.20-py2.py3-none-any.whl
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 (from requests==2.24.0)
  Using cached https://files.pythonhosted.org/packages/9f/f0/a391d1463ebb1b233795cabfc0ef38d3db4442339de68f847026199e69d7/urllib3-1.25.10-py2.py3-none-any.whl
Collecting chardet<4,>=3.0.2 (from requests==2.24.0)
  Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Collecting idna<3,>=2.5 (from requests==2.24.0)
  Using cached https://files.pythonhosted.org/packages/a2/38/928ddce2273eaa564f6f50de919327bf3a00f091b5baba8dfa9460f3a8a8/idna-2.10-py2.py3-none-any.whl
Installing collected packages: certifi, urllib3, chardet, idna, requests
Successfully installed certifi-2020.6.20 chardet-3.0.4 idna-2.10 requests-2.24.0 urllib3-1.25.10

We have all of our required dependencies installed so we can get started coding the application.

Uploading, initiating and transcribing audio

We have everything we need to start building our application that will transcribe audio into text. We're going to build this application in three files:

  1. upload_audio_file.py: uploads your audio file to a secure place on AssemblyAI's service so it can be accessed for processing. If your audio file is already accessible at a public URL, you don't need this step; you can just follow this quickstart
  2. initiate_transcription.py: tells the API which file to transcribe and to start immediately
  3. get_transcription.py: prints the status of the transcription if it is still processing, or displays the results of the transcription when the process is complete

Create a new directory named pytranscribe to store these files as we write them. Then change into the new project directory.

mkdir pytranscribe
cd pytranscribe

We also need to export our AssemblyAI API key as an environment variable. Sign up for an AssemblyAI account and log in to the AssemblyAI dashboard, then copy "Your API token" as shown in this screenshot:

export ASSEMBLYAI_KEY=your-api-key-here

Note that you must use the export command in every command line window in which you want this key to be accessible. The scripts we are writing will not be able to access the API if you do not have the token exported as ASSEMBLYAI_KEY in the environment where you are running the script.
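As an optional safeguard (my own addition, not part of the tutorial's scripts), you could fail fast at the top of each script when the variable is missing, instead of sending requests with a blank key:

import os
import sys

# Abort early with a helpful message if the API key was never exported
if not os.getenv("ASSEMBLYAI_KEY"):
    sys.exit("ASSEMBLYAI_KEY is not set; run the export command shown above first")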

Now that we have our project directory created and the API key set as an environment variable, let's move on to writing the code for the first file that will upload audio files to the AssemblyAI service.

Uploading the audio file for transcription

Create a new file named upload_audio_file.py and place the following code in it:

import argparse
import os

import requests

API_URL = "https://api.assemblyai.com/v2/"


def upload_file_to_api(filename):
    """Checks for a valid file and then uploads it to AssemblyAI so it can be
    saved to a secure URL that only that service can access. When the upload
    is complete we can then initiate the transcription API call.
    Returns the API JSON if successful, or None if file does not exist.
    """
    if not os.path.exists(filename):
        return None

    def read_file(filename, chunk_size=5242880):
        with open(filename, 'rb') as _file:
            while True:
                data = _file.read(chunk_size)
                if not data:
                    break
                yield data

    headers = {'authorization': os.getenv("ASSEMBLYAI_KEY")}
    response = requests.post("".join([API_URL, "upload"]),
                             headers=headers,
                             data=read_file(filename))
    return response.json()

The above code imports the argparse, os and requests packages so that we can use them in this script. The API_URL is a constant that has the base URL of the AssemblyAI service. We define the upload_file_to_api function with a single argument, filename, which should be a string with the absolute path to a file, including the file name itself.

Within the function, we check that the file exists, then use the requests package's chunked transfer encoding to stream large files to the AssemblyAI API. The nested read_file generator yields the file in 5 MB chunks so that the whole recording does not have to be loaded into memory at once.

The os module's getenv function reads the API key that was set on the command line with the export command. Make sure that you run that export command in the terminal where you are running this script, otherwise the ASSEMBLYAI_KEY value will be blank. When in doubt, use echo $ASSEMBLYAI_KEY to see if the value matches your API key.

To use the upload_file_to_api function, append the following lines of code in the upload_audio_file.py file so that we can properly execute this code as a script called with the python command:

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("filename")
    args = parser.parse_args()
    upload_filename = args.filename
    response_json = upload_file_to_api(upload_filename)
    if not response_json:
        print("file does not exist")
    else:
        print("File uploaded to URL: {}".format(response_json['upload_url']))

The code above creates an ArgumentParser object that allows the application to obtain a single argument from the command line to specify the file we want to access, read and upload to the AssemblyAI service.

If the file does not exist, the script will print a message that the file couldn't be found. In the happy path, where we do find the correct file at that path, the file is uploaded using the code in the upload_file_to_api function.

Execute the completed upload_audio_file.py script by running it on the command line with the python command. Replace FULL_PATH_TO_FILE with an absolute path to the file you want to upload, such as /Users/matt/devel/audio.mp3.

python upload_audio_file.py FULL_PATH_TO_FILE

Assuming the file is found at the location that you specified, when the script finishes uploading the file, it will print a message like this one with a unique URL:

File uploaded to URL: https://cdn.assemblyai.com/upload/463ce27f-0922-4ea9-9ce4-3353d84b5638

This URL is not public; it can only be used by the AssemblyAI service, so no one other than you and their transcription API will be able to access your file and its contents.

The important part is the last section of the URL; in this example it is 463ce27f-0922-4ea9-9ce4-3353d84b5638. Save that unique identifier because we need to pass it into the next script that initiates the transcription service.

Initiate transcription

Next, we'll write some code to kick off the transcription. Create a new file named initiate_transcription.py. Add the following code to the new file.

import argparse
import os

import requests

API_URL = "https://api.assemblyai.com/v2/"
CDN_URL = "https://cdn.assemblyai.com/"


def initiate_transcription(file_id):
    """Sends a request to the API to transcribe a specific file that was
    previously uploaded to the API. This will not immediately return the
    transcription because it takes a moment for the service to analyze and
    perform the transcription, so there is a different function to retrieve
    the results.
    """
    endpoint = "".join([API_URL, "transcript"])
    json = {"audio_url": "".join([CDN_URL, "upload/{}".format(file_id)])}
    headers = {
        "authorization": os.getenv("ASSEMBLYAI_KEY"),
        "content-type": "application/json"
    }
    response = requests.post(endpoint, json=json, headers=headers)
    return response.json()

We have the same imports as the previous script and we've added a new constant, CDN_URL, which matches the separate URL where AssemblyAI stores the uploaded audio files.

The initiate_transcription function essentially just sets up a single HTTP request to the AssemblyAI API to start the transcription process on the audio file at the specific URL passed in. This is why passing in the file_id is important: that completes the URL of the audio file that we are telling AssemblyAI to retrieve.

Finish the file by appending this code so that it can be easily invoked from the command line with arguments.

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("file_id")
    args = parser.parse_args()
    file_id = args.file_id
    response_json = initiate_transcription(file_id)
    print(response_json)

Start the script by running the python command on the initiate_transcription file and pass in the unique file identifier you saved from the previous step.

# the FILE_IDENTIFIER is returned in the previous step and will
# look something like this: 463ce27f-0922-4ea9-9ce4-3353d84b5638
python initiate_transcription.py FILE_IDENTIFIER

The API will send back a JSON response that this script prints to the command line.

{'audio_end_at': None, 'acoustic_model': 'assemblyai_default', 'text': None, 'audio_url': 'https://cdn.assemblyai.com/upload/463ce27f-0922-4ea9-9ce4-3353d84b5638', 'speed_boost': False, 'language_model': 'assemblyai_default', 'redact_pii': False, 'confidence': None, 'webhook_status_code': None, 'id': 'gkuu2krb1-8c7f-4fe3-bb69-6b14a2cac067', 'status': 'queued', 'boost_param': None, 'words': None, 'format_text': True, 'webhook_url': None, 'punctuate': True, 'utterances': None, 'audio_duration': None, 'auto_highlights': False, 'word_boost': [], 'dual_channel': None, 'audio_start_from': None}

Take note of the value of the id key in the JSON response. This is the transcription identifier we need to use to retrieve the transcription result. In this example, it is gkuu2krb1-8c7f-4fe3-bb69-6b14a2cac067. Copy the transcription identifier in your own response because we will need it to check when the transcription process has completed in the next step.

Retrieving the transcription result

We have uploaded and begun the transcription process, so let's get the result as soon as it is ready.

How long it takes to get the results back can depend on the size of the file, so this next script will send an HTTP request to the API and report back the status of the transcription, or print the output if it's complete.

Create a third Python file named get_transcription.py and put the following code into it.

import argparse
import os

import requests

API_URL = "https://api.assemblyai.com/v2/"


def get_transcription(transcription_id):
    """Requests the transcription from the API and returns the JSON
    response."""
    endpoint = "".join([API_URL, "transcript/{}".format(transcription_id)])
    headers = {"authorization": os.getenv('ASSEMBLYAI_KEY')}
    response = requests.get(endpoint, headers=headers)
    return response.json()


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("transcription_id")
    args = parser.parse_args()
    transcription_id = args.transcription_id
    response_json = get_transcription(transcription_id)
    if response_json['status'] == "completed":
        for word in response_json['words']:
            print(word['text'], end=" ")
    else:
        print("current status of transcription request: {}".format(
              response_json['status']))

The code above has the same imports as the other scripts. In this new get_transcription function, we simply call the AssemblyAI API with our API key and the transcription identifier from the previous step (not the file identifier). We retrieve the JSON response and return it.

In the main function we handle the transcription identifier that is passed in as a command line argument and pass it into the get_transcription function. If the response JSON from the get_transcription function contains a completed status then we print the results of the transcription. Otherwise, we print the current status, which is either queued or processing until the job completes.

Call the script using the command line and the transcription identifier from the previous section:

python get_transcription.py TRANSCRIPTION_ID

If the service has not yet started working on the transcript then it will return queued like this:

current status of transcription request: queued

When the service is currently working on the audio file it will return processing:

current status of transcription request: processing

When the process is completed, our script will return the text of the transcription, like you see here:

An object relational mapper is a code library that automates the transfer of data stored in relational, databases into objects that are more commonly used in application code or EMS are useful because they provide a high level ...(output abbreviated)

That's it, we've got our transcription!
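If you would rather not re-run get_transcription.py by hand until the work is done, you could wrap the same check in a small polling loop. The sketch below is my own addition rather than part of the tutorial's scripts: it reuses the get_transcription function from above, the 10-second interval is an arbitrary choice, and it assumes the API reports a terminal status of either completed or error.

import time

def wait_for_transcription(transcription_id, poll_interval_seconds=10):
    """Polls the API until the transcription reaches a terminal state,
    then returns the full JSON response."""
    while True:
        response_json = get_transcription(transcription_id)
        # 'completed' is the success state; 'error' is assumed to be the failure state
        if response_json['status'] in ("completed", "error"):
            return response_json
        print("current status of transcription request: {}".format(
              response_json['status']))
        time.sleep(poll_interval_seconds)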

You may be wondering what to do if the accuracy isn't where you need it to be for your situation. That is where boosting accuracy for keywords or phrases and selecting a model that better matches your data come in. You can use either of those two methods to boost the accuracy of your recordings to an acceptable level for your situation.
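The JSON response shown earlier already hints at the request fields involved in keyword boosting (word_boost and boost_param). As a rough sketch of the idea, and not a verified example of the API's exact request format, the payload built in initiate_transcription.py could be extended along these lines:

# Hypothetical extension of the json payload in initiate_transcription();
# the field names come from the API response shown earlier, but check the
# AssemblyAI documentation for the exact request format and accepted values.
json = {
    "audio_url": "".join([CDN_URL, "upload/{}".format(file_id)]),
    "word_boost": ["object relational mapper", "SQL"],
    "boost_param": "high",
}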

What's next?

We just finished writing some scripts that call the AssemblyAI API to transcribe recordings with speech into text output.

Next, take a look at some of their more advanced documentation that goes beyond the basics in this tutorial:

Questions? Let me know via an issue ticket on the Full Stack Python repository, on Twitter @fullstackpython or @mattmakai. See something wrong with this post? Fork this page's source on GitHub and submit a pull request.

Categories: FLOSS Project Planets

Catalin George Festila: Python 3.8.5 : Pearson Product Moment Correlation with corrcoef from numpy.

Planet Python - Sat, 2020-08-08 23:36
The Python package named numpy comes with the corrcoef function to return Pearson product-moment correlation coefficients. This method has a limitation in that it can compute the correlation matrix between two variables only. The full name is the Pearson Product Moment Correlation (PPMC). The PPMC is not able to tell the difference between dependent variables and independent variables. The
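As a quick illustration of corrcoef with two variables (my own minimal example, not part of the excerpt above):

import numpy as np

# Two variables with a strong positive linear relationship
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])

# corrcoef returns the matrix of Pearson product-moment correlation
# coefficients; the off-diagonal entries hold the correlation between x and y
r = np.corrcoef(x, y)
print(r)
print(r[0, 1])  # Pearson's r for x and y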
Categories: FLOSS Project Planets

Zato Blog: Zato and Docker installation options - general overview

Planet Python - Sat, 2020-08-08 21:47

Docker is a containerization platform that gained immense popularity in the IT world as a tool that can contain an application and help to deploy it to multiple environments.

History

Before Docker, there were different approaches whose goal was to obtain a reliable development and deployment process, which means testing and putting the application into production as a reproducible unit where the same result is always achieved.

The leading way to isolate and organize applications along with their dependencies was to place each and every application in its own virtual machine. Physical hardware runs multiple such environments and the whole process is called virtualization.

Containerization

Containerization is a type of virtualization which brings virtualization to the operating system level. Containerization brings abstraction to the operating system, while virtualization brings abstraction to the hardware.

  • Containers are lightweight and typically faster than Virtual Machines
  • Containers share relevant libraries and resources
  • The boot up process is very fast since the applications run on the host kernel
Docker

Docker is a tool used to easily package, distribute and deploy applications with all their dependencies into a standardized unit called a Docker image. It provides the ability to create predictable environments which can be used for development, testing and deploying the application in multiple environments, including production.

Some Docker concepts:

  • Image: an executable package that includes everything needed to run the application
  • Container: a runtime instance of an image

A Docker container can be connected with other containers that provide functionality to the application like a database or a load balancer.

In Zato we create Docker images of two kinds:

The idea behind the Zato quickstart image is to very quickly provision a fully working all-in-one Zato cluster that can be used to test a new service locally or explore Zato but, because this is a real Zato cluster, it can also be used for production purposes.

The image used for cloud environments provides more flexibility and can be used to deploy an instance of a Zato component like server, scheduler or web admin into the Zato cluster, each under its own container. That is, whereas Zato quickstart contains all the Zato components in a single image, the cloud one splits the components into separate Docker containers.

Wrapping up

This is it for now - we will be talking about these two types of Zato images and how to use them in upcoming articles.

Categories: FLOSS Project Planets

Charles Plessy: Thank you, VAIO

Planet Debian - Sat, 2020-08-08 21:01

I use a VAIO Pro mk2 every day that I bought 5 years ago with 3 years of warranty. For a few months I had been noticing that something was slowly inflating inside. In July, things accelerated to the point that its thickness had doubled. After we called VAIO's customer service, somebody came to pick up the laptop in order to make a cost estimate. Then we learned on the phone that the repair would be free. It was back in my hands in less than two weeks. Bravo, VAIO!

Categories: FLOSS Project Planets

Senior Developers don’t know Everything

Planet KDE - Sat, 2020-08-08 18:00

For about 20 years, I’ve been doing C++ and Qt and KDE development. I suppose that makes me a “senior software engineer”, also in the sense that I’ve hacked, programmed, futzed, designed, architected, tested, proved-correct, and cursed at a lot of software. But don’t let the label fool you: I look up just as much in the documentation as I ever did; senior developers don’t know everything.

Documentation I looked up today:

Some of that documentation is going to remain fresh in my mind for a bit, some is going to be forgotten almost immediately (like the JSON schema bits: I need it to write a schema, but the schema is write-and-forget because automatic tools take over from there at least until the schema needs a change, which could be over a year from now).

None of this information is essential to my work as a developer as long as I know how to look it up, and can do so quickly and effectively: trivia knowledge is not a measure of developer worthiness.

So, you know, if you're interviewing for a job position and get quizzed on this kind of thing, then there are two possibilities:

  • the job is all about doing-this-one-thing, in which case the knowledge is necessary and yes, knowing it already is a thing, or
  • this is an incidental, once-a-year item where gatekeeping over it is unadulterated cow poop (from which, I learned this week, you can distill vanilla aroma).

Keep learning, but don’t be afraid to forget.

Categories: FLOSS Project Planets

Holger Levsen: 20200808-debconf8

Planet Debian - Sat, 2020-08-08 12:10
DebConf8

This tshirt is 12 years old and from DebConf8.

DebConf8 was my 6th DebConf and took place in Mar de la Plata, Argentina.

Also this is my 6th post in this series of posts about DebConfs and for the last two days for the first time I failed my plan to do one post per day. And while two days ago I still planned to catch up on this by doing more than one post in a day, I have now decided to give in to realities, which mostly translates to sudden fantastic weather in Hamburg and other summer related changes in life. So yeah, I still plan to do short posts about all the DebConfs I was lucky to attend, but there might be days without a blog post. Anyhow, Mar de la Plata.

When we held DebConf in Argentina it was winter there, meaning locals and other folks would wear jackets, scarfs, probably gloves, while many Debian folks not so much. Andreas Tille freaked out and/or amazed local people by going swimming in the sea every morning. And when I told Stephen Gran that even I would find it a bit cold with just a t-shirt, he replied "na, the weather is fine, just like british summer", while it was 14 degrees Celsius and mildly raining.

DebConf8 was the first time I met Valessio Brito, with whom I had worked since at least DebConf6. That meeting was really super nice; Valessio is such a lovely person. Back in 2008, however, there was just one problem: his spoken English was worse than his written one, and that was already hard to parse sometimes. Fast forward eleven years to Curitiba last year and boom, Valessio speaks really nice English now.

And, you might wonder why I'm telling this, especially if you were exposed to my Spanish back then and also now. So my point in telling this story about Valessio is to illustrate two things: a.) one can contribute to Debian without speaking/writing much English, Valessio did lots of great artwork since DebConf6 and b.) one can learn English by doing Debian stuff. It worked for me too!

During set up of the conference there was one very memorable moment, some time after the openssl maintainer arrived at the venue: Shortly before DebConf8 Luciano Bello, from Argentina no less, had found CVE-2008-0166 which basically compromised the security of sshd of all Debian and Ubuntu installations done in the last 4 years (IIRC two Debian releases were affected) and which was commented heavily everywhere. So poor Q arrived and wondered whether we would all hate him, how many toilets he would have to clean and what not... And then, someone rather quickly noticed this, approached some people and suddenly all 23-42 people at DebConf were group-hugging Q and then we were all smiling and continuing doing set up of the conference.

That moment is one of my most joyful memories of all DebConfs and partly explains why I remember little about the conference itself; everything else pales in comparison and most things pale over the years anyway. As I remember it, the conference ran very smoothly in the end, despite quite some organisational problems right before the start. But as usual, once the geeks arrive and are happily geeking, things start to run smoothly, also because Debian people are kind and smart and lend hands and brains where needed.

And like other DebConfs, Mar de la Plata also had moments which I want to share but I will only hint about, so it's up to you to imagine the special leaves which were brought to that cheese and wine party!

Categories: FLOSS Project Planets

Reproducible Builds: Reproducible Builds in July 2020

Planet Debian - Sat, 2020-08-08 11:52

Welcome to the July 2020 report from the Reproducible Builds project.

In these monthly reports, we round-up the things that we have been up to over the past month. As a brief refresher, the motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced from the original free software source code to the pre-compiled binaries we install on our systems. (If you’re interested in contributing to the project, please visit our main website.)

General news

At the upcoming DebConf20 conference (now being held online), Holger Levsen will present a talk on Thursday 27th August about “Reproducing Bullseye in practice”, focusing on independently verifying that the binaries distributed from ftp.debian.org were made from their claimed sources.

Tavis Ormandy published a blog post making the provocative claim that “You don’t need reproducible builds”, asserting elsewhere that the many attacks that have been extensively reported in our previous reports are “fantasy threat models”. A number of rebuttals have been made, including one from long-time Reproducible Builds contributor Bernhard Wiedemann.

On our mailing list this month, Debian Developer Graham Inggs posted asking for ideas about why the openorienteering-mapper Debian package was failing to build on the Reproducible Builds testing framework. Chris Lamb remarked from the build logs that the package may be missing a build dependency, although Graham then used our own diffoscope tool to show that the resulting package remains unchanged with or without it. Later, Nico Tyni noticed that the build failure may be due to the relationship between the __FILE__ C preprocessor macro and the -ffile-prefix-map GCC flag.

An issue in Zephyr, a small-footprint kernel designed for use on resource-constrained systems, around .a library files not being reproducible was closed after it was noticed that a key part of their toolchain had been updated so that it now calls --enable-deterministic-archives by default.

Reproducible Builds developer kpcyrd commented on a pull request against the libsodium cryptographic library wrapper for Rust, arguing against the testing of CPU features at compile-time. He noted that:

I’ve accidentally shipped broken updates to users in the past because the build system was feature-tested and the final binary assumed the instructions would be present without further runtime checks

David Kleuker also asked a question on our mailing list about using SOURCE_DATE_EPOCH with the install(1) tool from GNU coreutils. When comparing two installed packages he noticed that the filesystem ‘birth times’ differed between them. Chris Lamb replied, realising that this was actually a consequence of using an outdated version of diffoscope and that a fix was in diffoscope version 146 released in May 2020.

Later in July, John Scott posted asking for clarification regarding the Javascript files on our website, in order to add metadata for LibreJS, the browser extension that blocks non-free Javascript scripts from executing. Chris Lamb investigated the issue and realised that we could drop a number of unused Javascript files [][][] and added unminified versions of Bootstrap and jQuery [].


Development work Website

On our website this month, Chris Lamb updated the main Reproducible Builds website and documentation to drop a number of unused Javascript files [][][] and added unminified versions of Bootstrap and jQuery []. He also fixed a number of broken URLs [][].

Gonzalo Bulnes Guilpain made a large number of grammatical improvements [][][][][] as well as fixing some misspellings and making case and whitespace changes [][][].

Lastly, Holger Levsen updated the README file [], marked the Alpine Linux continuous integration tests as currently disabled [] and linked the Arch Linux Reproducible Status page from our projects page [].

diffoscope

diffoscope is our in-depth and content-aware diff utility that can not only locate and diagnose reproducibility issues, but also provides human-readable diffs of all kinds. In July, Chris Lamb made the following changes to diffoscope, including releasing versions 150, 151, 152, 153 & 154:

  • New features:

    • Add support for flash-optimised F2FS filesystems. (#207)
    • Don’t require zipnote(1) to determine differences in a .zip file as we can use libarchive. []
    • Allow --profile as a synonym for --profile=-, ie. write profiling data to standard output. []
    • Increase the minimum length of the output of strings(1) to eight characters to avoid unnecessary diff noise. []
    • Drop some legacy argument styles: --exclude-directory-metadata and --no-exclude-directory-metadata have been replaced with --exclude-directory-metadata={yes,no}. []
  • Bug fixes:

    • Pass the absolute path when extracting members from SquashFS images as we run the command with working directory in a temporary directory. (#189)
    • Correct adding a comment when we cannot extract a filesystem due to missing libguestfs module. []
    • Don’t crash when listing entries in archives if they don’t have a listed size such as hardlinks in ISO images. (#188)
  • Output improvements:

    • Strip off the file offset prefix from xxd(1) and show bytes in groups of 4. []
    • Don’t emit javap not found in path if it is available in the path but it did not result in an actual difference. []
    • Fix ... not available in path messages when looking for Java decompilers that used the Python class name instead of the command. []
  • Logging improvements:

    • Add a bit more debugging info when launching libguestfs. []
    • Reduce the --debug log noise by truncating the has_some_content messages. []
    • Fix the compare_files log message when the file does not have a literal name. []
  • Codebase improvements:

    • Rewrite and rename exit_if_paths_do_not_exist to not check files multiple times. [][]
    • Add an add_comment helper method; don’t mess with our internal list directly. []
    • Replace some simple usages of str.format with Python ‘f-strings’ [] and make it easier to navigate to the main.py entry point [].
    • In the RData comparator, always explicitly return None in the failure case as we return a non-None value in the success one. []
    • Tidy some imports [][][] and don’t alias a variable when we do not use it. []
    • Clarify the use of a separate NullChanges quasi-file to represent missing data in the Debian package comparator [] and clarify use of a ‘null’ diff in order to remember an exit code. []
  • Other changes:

Jean-Romain Garnier also made the following changes:

  • Allow passing a file with a list of arguments via diffoscope @args.txt. (!62)
  • Improve the output of side-by-side diffs by detecting added lines better. (!64)
  • Remove offsets before instructions in objdump [][] and remove raw instructions from ELF tests [].
Other tools

strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. It is used automatically in most Debian package builds. In July, Chris Lamb ensured that we did not install the internal handler documentation generated from Perl POD documents [] and fixed a trivial typo []. Marc Herbert added a --verbose-level warning when the Archive::Cpio Perl module is missing. (!6)

reprotest is our end-user tool to build the same source code twice in widely differing environments and then check the binaries produced by each build for any differences. This month, Vagrant Cascadian made a number of changes to support diffoscope version 153, which had removed the (deprecated) --exclude-directory-metadata and --no-exclude-directory-metadata command-line arguments, and updated the testing configuration to also test under Python version 3.8 [].


Distributions Debian

In June 2020, Timo Röhling filed a wishlist bug against the debhelper build tool impacting the reproducibility status of hundreds of packages that use the CMake build system. This month however, Niels Thykier uploaded debhelper version 13.2 that passes the -DCMAKE_SKIP_RPATH=ON and -DBUILD_RPATH_USE_ORIGIN=ON arguments to CMake when using the (currently-experimental) Debhelper compatibility level 14.

According to Niels, this change:

… should fix some reproducibility issues, but may cause breakage if packages run binaries directly from the build directory.

34 reviews of Debian packages were added, 14 were updated and 20 were removed this month adding to our knowledge about identified issues. Chris Lamb added and categorised the nondeterministic_order_of_debhelper_snippets_added_by_dh_fortran_mod [] and gem2deb_install_mkmf_log [] toolchain issues.

Lastly, Holger Levsen filed two more wishlist bugs against the debrebuild Debian package rebuilder tool [][].

openSUSE

In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update.

Bernhard also published the results of performing 12,235 verification builds of packages from openSUSE Leap version 15.2 and, as a result, created three pull requests against the openSUSE Build Result Compare Script [][][].

Other distributions

In Arch Linux, there was a mass rebuild of old packages in an attempt to make them reproducible. This was performed because building with a previous release of the pacman package manager caused file ordering and size calculation issues when using the btrfs filesystem.

A system was also implemented for Arch Linux packagers to receive notifications if/when their package becomes unreproducible, and packagers now have access to a dashboard where they can all see all their unreproducible packages (more info).

Paul Spooren sent two versions of a patch for the OpenWrt embedded distribution for adding a ‘build system’ revision to the ‘packages’ manifest so that all external feeds can be rebuilt and verified. [][]

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of these patches, including:

Vagrant Cascadian also reported two issues, the first regarding a regression in u-boot boot loader reproducibility for a particular target [] and a non-deterministic segmentation fault in the guile-ssh test suite []. Lastly, Jelle van der Waa filed a bug against the MeiliSearch search API to report that it embeds the current build date.

Testing framework

We operate a large and many-featured Jenkins-based testing framework that powers tests.reproducible-builds.org.

This month, Holger Levsen made the following changes:

  • Debian-related changes:

    • Tweak the rescheduling of various architecture and suite combinations. [][]
    • Fix links for ‘404’ and ‘not for us’ icons. (#959363)
    • Further work on a rebuilder prototype, for example correctly processing the sbuild exit code. [][]
    • Update the sudo configuration file to allow the node health job to work correctly. []
    • Add php-horde packages back to the pkg-php-pear package set for the bullseye distribution. []
    • Update the version of debrebuild. []
  • System health check development:

    • Add checks for broken SSH [], logrotate [], pbuilder [], NetBSD [], ‘unkillable’ processes [], unresponsive nodes [][][][], proxy connection failures [], too many installed kernels [], etc.
    • Automatically fix some failed systemd units. []
    • Add notes explaining all the issues that hosts are experiencing [] and handle zipped job log files correctly [].
    • Separate nodes which have been automatically marked as down [] and show status icons for jobs with issues [].
  • Misc:

In addition, Mattia Rizzolo updated the init_node script to suggest using sudo instead of explicit logout and logins [][] and the usual build node maintenance was performed by Holger Levsen [][][][][][], Mattia Rizzolo [][] and Vagrant Cascadian [][][][].


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Categories: FLOSS Project Planets

PSF GSoC students blogs: Week 6 Check-in

Planet Python - Sat, 2020-08-08 11:02

What I have done this week

Work towards analyzing multistage Dockerfiles. I combined the draft PR and the review from my mentors; the new commit is the first step of my plan. We split the multistage Dockerfile into separate Dockerfiles for the build (a rough sketch of the idea follows the list below). Here are the changes in the new commit.

1. Modified function check_multistage_dockerfile() to return.

2. Removed function split_multistage_dockerfile() since we are working on the build stage. split_multistage_dockerfile() can be improved in the analyze stage.
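For context, splitting a multistage Dockerfile essentially means cutting it at each FROM instruction. A rough, hypothetical sketch of that idea (my own illustration, not the actual project code) could look like this:

def split_stages(dockerfile_lines):
    """Split the lines of a multistage Dockerfile into one list of lines per
    stage, starting a new stage at each FROM instruction."""
    stages = []
    current = []
    for line in dockerfile_lines:
        if line.lstrip().upper().startswith("FROM ") and current:
            stages.append(current)
            current = []
        current.append(line)
    if current:
        stages.append(current)
    return stages

Each resulting list of lines can then be written out as its own single-stage Dockerfile and built separately.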

To Do

1. Improve readability for function check_multistage_dockerfile().

2. Try building images and analyzing them.

Did I get stuck somewhere?

Not yet.

Categories: FLOSS Project Planets

Codementor: Send WhatsApp media/message using Python.

Planet Python - Sat, 2020-08-08 07:17
Send WhatsApp message/media using python.
Categories: FLOSS Project Planets

Thorsten Alteholz: My Debian Activities in July 2020

Planet Debian - Sat, 2020-08-08 05:54

FTP master

This month I accepted 434 packages and rejected 54. The overall number of packages that got accepted was 475.

Debian LTS

This was my seventy-third month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian.

This month my overall workload was 25.25h. During that time I did LTS uploads of:

  • [DLA 2289-1] mupdf security update for five CVEs
  • [DLA 2290-1] e2fsprogs security update for one CVE
  • [DLA 2294-1] salt security update for two CVEs
  • [DLA 2295-1] curl security update for one CVE
  • [DLA 2296-1] luajit security update for one CVE
  • [DLA 2298-1] libapache2-mod-auth-openidc security update for three CVEs

I started to work on python2.7 as well but stumbled over some hurdles in the testsuite, so I did not upload a fixed version yet.

This month was much influenced by the transition from Jessie LTS to Stretch LTS and one or another workflow/script needed some adjustments.

Last but not least I did some days of frontdesk duties.

Debian ELTS

This month was the twenty-fifth ELTS month.

During my allocated time I uploaded:

  • ELA-230-1 for luajit
  • ELA-231-1 for curl

Like in LTS, I also started to work on python2.7 and encountered the same hurdles in the testsuite. So I did not upload a fixed version for ELTS as well.

Last but not least I did some days of frontdesk duties.

Other stuff

In this section nothing much happened this month. Thanks a lot to everybody who NMUed a package to fix a bug.

Categories: FLOSS Project Planets

How to rock: First tips

Planet KDE - Sat, 2020-08-08 04:28

After working for some time collaborating with open source/free software projects, I started to help newcomers to contribute and push the development further with good practices. The following post will try to itemize some important tips that can be used for software collaboration and personal projects.

Code practices
  • Encapsulate magic variables:

    ...// First version
    start_communication(0, 1, 232);
    // Reviewer: What is the meaning of 0 ? What is 1 ? Why 232 ?

    ...// Second version
    enum class Messages {
        ...
        RequestStatus = 232,
    };
    ...
    const uint8_t vid = 0;
    const uint8_t cid = 1;
    start_communication(vid, cid, Messages::RequestStatus);
    // Reviewer: What is vid ? What is cid ?

    ...// Final version
    ...
    const uint8_t vehicle_id = 0;
    const uint8_t component_id = 1;
    start_communication(vehicle_id, component_id, Messages::RequestStatus);

    As you can see, the final version makes everything more readable: we know that we are starting the communication with a vehicle that has an id of 0, that the vehicle contains a component with id 1, and that while calling this function we are also requesting the status. Much better than 0, 1 and 232, right? Doing this will help reviewers and future developers understand the meaning of such numbers. It's also necessary to avoid variables that contain only single letters or really short abbreviations; it is much harder to understand this: $$C^2 = A^2 + B^2$$ than this: $$hypotenuse^2 = catheti_1^2 + catheti_2^2$$

  • Avoid multiple arguments:

    ...// First version
    auto vehicle = new Vehicle(
        VehicleType::Car,
        4,
        {28, 31},
        Fuel::Electric,
        1.2,
        613
    );
    // Reviewer: What is the meaning of all these values ?
    // How can we make it better ?

    ...// Second version
    auto vehicle = new Vehicle(VehicleType::Car);
    vehicle->setNumberOfTires(4);
    vehicle->setTirePressure({28, 31});
    vehicle->setFuel(Fuel::Electric);
    vehicle->setWeightInTons(1.2);
    vehicle->setAutonomyInKm(613);

    ...// It's also possible to use aggregate initialization in C++20
    // and user-defined literals from C++11
    auto vehicle = new Vehicle({
        .type = VehicleType::Car,
        .numberOfTires = 4,
        .tirePressure = {28_psi, 31_psi},
        .fuel = Fuel::Electric,
        .weight = 1.2_tn,
        .autonomy = 613_km,
    });

    Both the second and the C++20/C++11 alternatives are valid for better readability of the code; choosing between them will depend on how you are going to design your API. If you are more familiar with the Qt API, the second version will probably appear the most common; the C++20/C++11 alternative is a bit more verbose but can be useful to avoid multiple function calls and helpful when dealing with a simpler code base.

  • Encapsulate code when necessary, try to break functions in a more readable and explanatory way:

    ...// Original version
    void Serial::start_serial_communication() {
        // Check if we are open to talk
        if (!_port || _port->register() != 0xb0001) {
            log("Serial port is not open!");
            return;
        }

        // Send a 10ms serial break signal
        _port->set_break_enabled(true);
        msleep(10);
        _port->set_break_enabled(false);
        usleep(10);

        // Send intercalated binary for detection
        _port->write('U');
        _port->flush();

        // Send start AT command
        _port->write("AT+start");
    }
    // Reviewer: Try to make it more readable by encapsulating
    // some functionalities

    ...// Second version
    bool Serial::is_port_open() {
        return _port && _port->register() == 0xb0001;
    }

    void Serial::force_baudrate_detection() {
        // Send a 10ms serial break signal
        _port->set_break_enabled(true);
        msleep(10);
        _port->set_break_enabled(false);
        usleep(10);

        // Send intercalated binary for detection
        _port->write('U');
        _port->flush();
    }

    void Serial::send_message(Message message_type) {
        _port->write(messageFromType(message_type));
    }

    void Serial::start_serial_communication() {
        if (!is_port_open()) {
            log("Serial port is not open!");
            return;
        }

        force_baudrate_detection();
        send_message(Message::Start);
    }

    As you can see, the reason behind each block of code is clear now, and with that the comments are no longer necessary; the code is friendly and readable enough that it's possible to understand it without any comments, because the function names do the job for free.

  • Avoid comments. That's clickbait: comments are really necessary, but they may be unnecessary when you are doing something that's really obvious, and sometimes, when something is not obvious, it's better to encapsulate it.

    ...// Original version
    void blink_routine() {
        // Use the LED builtin
        const int led_builtin = LED_BUILTIN;

        // Configure pin to output
        setPinAsOutput(led_builtin);

        // Loop forever
        while(true) {
            // Turn the LED on
            turnPinOn(led_builtin);
            // Wait for a second
            wait_seconds(1);
            // Turn the LED off
            turnPinOff(led_builtin);
            // Wait for a second
            wait_seconds(1);
        }
    }

    Before checking the final version, let me talk more about it:

    • For each line of code you'll have a comment (like a parrot that repeats what we say), and the worst thing about these comments is that the content is exactly what you can read from the code! You can think of this kind of comment as dead code: something that has the same meaning as the code but does not run, resulting in a duplicated number of lines to maintain. If you forget to update each comment for each line of code, you'll have a comment that does not match the code, and this will be pretty confusing for someone reading it.

    • One of the most important skills in writing comments is to know when not to write them!

    • A comment should bring value to the code; if you can remove the comment and the code is still understandable by a newcomer, the comment is not important.

    ...// Final version
    void blink_routine() {
        const int led_builtin = LED_BUILTIN;
        setPinAsOutput(led_builtin);

        while(true) {
            turnPinOn(led_builtin);
            wait_seconds(1);
            turnPinOff(led_builtin);
            wait_seconds(1);
        }
    }

    There is a good video about this subject by Walter E. Brown at CppCon 2017, "Whitespace ≤ Comments << Code".

    And to finish, you should not avoid comments; you should understand when comments are necessary, like this:

    // From ArduPilot - GPIO_RPI
    void set_gpio_mode_alt(int pin, int alternative) {
        // **Moved content from cpp for this example**
        const uint8_t pins_per_register = 10;
        // Calculates the position of the 3 bit mask in the 32 bits register
        const uint8_t tree_bits_position_in_register = (pin%pins_per_register)*3;
        /** Creates a mask to enable the alternative function based in the following logic:
         *
         * | Alternative Function | 3 bits value |
         * |:--------------------:|:------------:|
         * | Function 0           | 0b100        |
         * | Function 1           | 0b101        |
         * | Function 2           | 0b110        |
         * | Function 3           | 0b111        |
         * | Function 4           | 0b011        |
         * | Function 5           | 0b010        |
         */
        const uint8_t alternative_value = (alternative < 4 ? (alternative + 4) : (alternative == 4 ? 3 : 2));
        // 0b00'000'000'000'000'000'000'ALT'000'000'000 enables alternative for the 4th pin
        const uint32_t mask_with_alt = static_cast<uint32_t>(alternative_value) << tree_bits_position_in_register;
        const uint32_t mask = 0b111 << tree_bits_position_in_register;
        // Clear all bits in our position and apply our mask with alt values
        uint32_t register_value = _gpio[pin / pins_per_register];
        register_value &= ~mask;
        _gpio[pin / pins_per_register] = register_value | mask_with_alt;
    }

    Most of the lines in this code would be impossible to understand without reading the datasheet directly; the comments are here to explain what is going on and why, otherwise anyone who touches this code will need to reverse engineer it to understand it.

Development flow
  • Avoid creating multiple Pull Requests (PRs); update the ones that are still open.

    • E.g.: You created a Pull Request called "Add button feature"; some modifications will be necessary after the review process, and for that you'll need to update the same branch rather than creating new ones. That's necessary to help the project maintainers see previous comments and the development history. Creating multiple PRs will only make the maintainers confused and unable to track old comments, suggestions and your code changes between PRs.
  • Create atomic and self-contained commits; avoid doing multiple tasks in the same commit.

    • E.g.: You created a commit to fix the serial communication class, and inside the same commit you are doing 3 different tasks: removing trailing spaces, fixing a pointer validation check, and fixing a typo in the documentation of a different class. This may appear silly and bureaucratic, but there are good reasons to break this simple commit into at least 3 different commits: one for the pointer check, a second one for the typo and a third one for the trailing spaces.

      Developers usually track line history to understand the changes behind a functionality; it's common to grep the history of commits or line changes in specific commits to understand the history of a library, function, class, or a small feature. If the commits start to be polluted with unnecessary changes, this development practice becomes almost impossible, since a bunch of unrelated lines will be changed between commits and this technique will be unable to help the dear developer. git blame will also be of little help.

      The example was also really simple, but you can imagine what happens if you change different parts of the code for unrelated things and a bug appears: techniques such as git bisect will still work, but the result will be much harder to understand, and it will be harder to find which line is the one that created the bug.

  • Create atomic and self-contained PRs; avoid doing multiple things in the same PR, like different features. They may appear simple, with small pieces of code/functionality, but they can escalate quickly after a review process, and if the features are somehow related or dependent, it's recommended to break them into multiple PRs with code to maintain compatibility with the current code base.

    • E.g.: You have opened a PR for software notifications, and somehow you also added URL fetch functionality to grab new software versions from the server. After the first review, the maintainer asks for a more abstracted way to fetch data from a REST API and to deal with network requirements; this will start to convolute the PR, moving it from the initial idea of the notification feature to an entire network REST API architecture. With that, it's better to break the PR in two: one that only provides the notification and a second PR that is used for the REST API related code.
  • Do your own review. This is the final and probably most important tip of all: doing it will train your mind and eyes to detect poor code standards or bad practices, and it will also get your PR merged more easily and faster, since you'll be able to catch problems before the reviewer's feedback. Some reviewers may think that reviewing your own PR is a must. Since we are talking about open source projects and free software, you should understand that the people reviewing your code are not obligated to do so; the majority are collaborating and donating their own time to help the development and maintenance of such projects, and doing your own review is a sign of empathy for the project and the maintainers' time.

Final comment

This is the first post of a series that I'm planning to do. Hope that some of these points may help you in your journey.

References
Categories: FLOSS Project Planets

This week in KDE: window thumbnails on Wayland

Planet KDE - Sat, 2020-08-08 01:27

This week we got tons and tons of stuff done, including window thumbnails on Wayland! This much-awaited feature brings our Wayland session ever closer to parity with the X11 session. But wait, there’s more:

New Features

Konsole now lets you darken inactive terminals (Tomaz Canabrava, Konsole 20.12.0)

Task Manager window thumbnails now work on Wayland! (Aleix Pol Gonzalez, Plasma 5.20)

Discover can now be used to perform updates of content downloaded through the Get New Stuff dialogs (Dan Leinir Turthra Jensen, Plasma 5.20)

Plasma applets now feature an “About” page in their settings windows (David Redondo, Plasma 5.20)

Kate and other KTextEditor-based apps now show a zoom indicator in the status bar when the current zoom level is not 100% (Jan Paul Batrina, Frameworks 5.74)

Bugfixes & Performance Improvements

Opening an audio file from the filesystem in Elisa now works (Matthieu Gallien, Elisa 20.08.0)

Switching screens while in Okular’s Presentation Mode now works (David Hurka, Okular 20.08.0)

Fixed a case where KWin could crash when logging out of a Wayland session (Andrey Butirsky, Plasma 5.20)

In a Plasma Wayland session, XWayland no longer brings down the whole session when it crashes; it just restarts normally (Vlad Zahorodniy, Plasma 5.20)

Changing the list of active KRunner plugins now takes effect immediately rather than requiring KRunner to be restarted (Alexander Lohnau, Plasma 5.20)

The Search widget now respects the current list of active KRunner plugins (Alexander Lohnau, Plasma 5.20)

The mouse cursor no longer sometimes gets stuck when using screen rotation on Wayland (Aleix Pol Gonzalez, Plasma 5.20)

Edge swipe gestures and showing a hidden panel by tapping the screen edge now work on Wayland (Xaver Hugl, Plasma 5.20)

Adding a new network interface no longer messes up the display in the Networks system monitor (David Edmundson, Plasma 5.20)

Changing the systemwide scale factor now invalidates the Plasma SVG cache, causing SVG-based user interface elements throughout Plasma to be re-drawn with the correct scale, which should fix a wide variety of minor graphical glitches seen after changing the scale factor (David Edmundson, Frameworks 5.74)

The Baloo file indexer now skips files that repeatedly fail to index rather than repeatedly trying to re-index them anyway and failing in a loop that trashes your CPU (Stefan Brüns, Frameworks 5.74).

User Interface Improvements

When applying a tag to a file in Dolphin, if the tags menu only had one item in it, it now automatically closes after applying the tag (Ismael Asensio, Dolphin 20.08.0)

The current date is now shown in the Digital Clock applet by default (Claudius Ellsel, Plasma 5.20)

Animation speeds throughout the Breeze Widgets and Decoration themes now respect the global animation speed (Martin Sandsmark and Marco Martin, Plasma 5.20)

It’s now possible to do multiplication in KRunner using “x” as the multiplication operator, not just “*” (Alexander Lohnau, Plasma 5.20)

KRunner now shows tooltips for entries that don’t entirely fit, so you now have a way to read the dictionary text (Alexander Lohnau, Plasma 5.20)

And yes, multi-line output is coming soon as well

Minimizing a window no longer puts it at the very end of the Task Switcher; it now moves to the next position and there is no special handling (me: Nate Graham, Plasma 5.20)

Made various fixes and improvements to the Breeze GTK theme: Sidebars in GTK Assistant are now readable, floating status bars are no longer transparent, the window shadow now matches that of KDE apps, and pop-up shadows now look nicer (Carson Black, Plasma 5.20)

The Get New [Thing] windows now display more appropriate icons for their Update and Uninstall actions (me: Nate Graham, Frameworks 5.74)

How You Can Help

Have a look at https://community.kde.org/Get_Involved to discover ways to be part of a project that really matters. Each contributor makes a huge difference in KDE; you are not a number or a cog in a machine! You don’t have to already be a programmer, either. I wasn’t when I got started. Try it, you’ll like it! We don’t bite!

Finally, consider making a tax-deductible donation to the KDE e.V. foundation.

Categories: FLOSS Project Planets

Dirk Eddelbuettel: RVowpalWabbit 0.0.15: Some More CRAN Build Issues

Planet Debian - Sat, 2020-08-08 00:55

Another maintenance RVowpalWabbit package update brought us to version 0.0.15 earlier today. We attempted to fix one compilation error on Solaris, and addressed a few SAN/UBSAN issues with the gcc build.

As noted before, there is a newer package rvw based on the excellent GSoC 2018 and beyond work by Ivan Pavlov (mentored by James and myself) so if you are into Vowpal Wabbit from R go check it out.

CRANberries provides a summary of changes to the previous version. More information is on the RVowpalWabbit page. Issues and bug reports should go to the GitHub issue tracker.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

François Marier: Setting the default web browser on Debian and Ubuntu

Planet Debian - Sat, 2020-08-08 00:10

If you are wondering what your default web browser is set to on a Debian-based system, there are several things to look at:

$ xdg-settings get default-web-browser
brave-browser.desktop
$ xdg-mime query default x-scheme-handler/http
brave-browser.desktop
$ xdg-mime query default x-scheme-handler/https
brave-browser.desktop
$ ls -l /etc/alternatives/x-www-browser
lrwxrwxrwx 1 root root 29 Jul 5 2019 /etc/alternatives/x-www-browser -> /usr/bin/brave-browser-stable*
$ ls -l /etc/alternatives/gnome-www-browser
lrwxrwxrwx 1 root root 29 Jul 5 2019 /etc/alternatives/gnome-www-browser -> /usr/bin/brave-browser-stable*

Debian-specific tools

The contents of /etc/alternatives/ are system-wide defaults and must therefore be set as root:

sudo update-alternatives --config x-www-browser
sudo update-alternatives --config gnome-www-browser

The sensible-browser tool (from the sensible-utils package) will use these to automatically launch the most appropriate web browser depending on the desktop environment.

Standard MIME tools

The others can be changed as a normal user. Using xdg-settings:

xdg-settings set default-web-browser brave-browser-beta.desktop

will also change what the two xdg-mime commands return:

$ xdg-mime query default x-scheme-handler/http
brave-browser-beta.desktop
$ xdg-mime query default x-scheme-handler/https
brave-browser-beta.desktop

since it puts the following in ~/.config/mimeapps.list:

[Default Applications]
text/html=brave-browser-beta.desktop
x-scheme-handler/http=brave-browser-beta.desktop
x-scheme-handler/https=brave-browser-beta.desktop
x-scheme-handler/about=brave-browser-beta.desktop
x-scheme-handler/unknown=brave-browser-beta.desktop

Note that if you delete these entries, then the system-wide defaults, defined in /etc/mailcap, will be used, as provided by the mime-support package.

Changing the x-scheme-handler/http (or x-scheme-handler/https) association directly using:

xdg-mime default brave-browser-nightly.desktop x-scheme-handler/http

will only change that particular association. I suppose this means you could have one browser for insecure HTTP sites (hopefully with HTTPS Everywhere installed) and one for HTTPS sites, though I'm not sure why anybody would want that.

Summary

In short, if you want to set your default browser everywhere (using Brave in this example), do the following:

sudo update-alternatives --config x-www-browser
sudo update-alternatives --config gnome-www-browser
xdg-settings set default-web-browser brave-browser.desktop
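
If you prefer to script the per-user part, here is a small Python sketch of my own (not from the original post) that simply shells out to the xdg-settings command shown above; the brave-browser.desktop name is just an example and should be replaced with whatever .desktop file your browser ships:

# Sketch only: wraps the per-user xdg-settings calls shown above.
# The system-wide update-alternatives step still needs to be run as root.
import subprocess

def get_default_browser():
    result = subprocess.run(
        ["xdg-settings", "get", "default-web-browser"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

def set_default_browser(desktop_file="brave-browser.desktop"):
    # desktop_file is an example; use the .desktop name of your browser
    subprocess.run(
        ["xdg-settings", "set", "default-web-browser", desktop_file],
        check=True,
    )

if __name__ == "__main__":
    print("Current default browser:", get_default_browser())
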
Categories: FLOSS Project Planets

Jelmer Vernooij: Improvements to Merge Proposals by the Janitor

Planet Debian - Fri, 2020-08-07 20:30

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

Since the original post, merge proposals created by the janitor now include the debdiff between a build with and without the changes (showing the impact to the binary packages), in addition to the merge proposal diff (which shows the impact to the source package).

New merge proposals also include a link to the diffoscope diff between a vanilla build and the build with changes. Unfortunately these can be a bit noisy for packages that are not reproducible yet, due to the difference in build environment between the two builds.
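
For readers unfamiliar with those tools, here is a rough illustration of comparing two builds; this is my own sketch with hypothetical file names, not the Janitor's actual pipeline:

# Sketch only: compare a vanilla build with a patched build, roughly as
# described above. The .deb file names here are hypothetical.
import subprocess

vanilla = "example_1.0-1_amd64.deb"           # build without the proposed changes
patched = "example_1.0-1~janitor_amd64.deb"   # build with the proposed changes

# debdiff summarises the differences between the two binary packages
subprocess.run(["debdiff", vanilla, patched])

# diffoscope digs deeper and can write a detailed HTML report
subprocess.run(["diffoscope", "--html", "diffoscope.html", vanilla, patched])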

This is part of the effort to keep the changes from the janitor high-quality.

The rollout surfaced some bugs in lintian-brush; these have been either fixed or mitigated (e.g. by disabling specified fixers).

For more information about the Janitor's lintian-fixes efforts, see the landing page.

Categories: FLOSS Project Planets

Sandipan Dey: Simulating a Turing Machine with Python and executing programs on it

Planet Python - Fri, 2020-08-07 20:01
In this article, we shall implement a basic version of a Turing machine in Python and write a few simple programs to execute on it. This article is inspired by the edX / MITx course Paradox and Infinity, and a few of the programs to be executed on the (simulated) Turing machine are …
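
As a taste of what such a simulator involves, here is a minimal sketch of my own (not the article's code): a dictionary of transition rules of the form (state, symbol) -> (new state, symbol to write, head move), applied to a tape until no rule matches. The flip_bits program below is a made-up example:

# Minimal Turing machine sketch, for illustration only (not the article's code).
def run_turing_machine(transitions, tape, state="q0", blank="_", max_steps=10_000):
    """transitions maps (state, symbol) -> (new_state, write_symbol, move),
    where move is "L", "R" or "N". The machine halts when no rule applies."""
    cells = dict(enumerate(tape))   # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        symbol = cells.get(head, blank)
        if (state, symbol) not in transitions:
            break                    # no matching rule: halt
        state, write, move = transitions[(state, symbol)]
        cells[head] = write
        head += {"R": 1, "L": -1, "N": 0}[move]
    positions = range(min(cells), max(cells) + 1)
    return state, "".join(cells.get(i, blank) for i in positions)

# Made-up example program: flip every bit of a binary string.
flip_bits = {
    ("q0", "0"): ("q0", "1", "R"),
    ("q0", "1"): ("q0", "0", "R"),
}
print(run_turing_machine(flip_bits, "10110"))   # -> ('q0', '01001')
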
Categories: FLOSS Project Planets

Antonio Terceiro: When community service providers fail

Planet Debian - Fri, 2020-08-07 18:00

I'm starting a new blog, and instead of going into the technical details on how it's made, or on how I migrated my content from the previous ones, I want to focus on why I did it.

It's been a while since I have written a blog post. I wanted to get back into it, but also wanted to finally self-host my blog/homepage because I have been let down before. And sadly, I was not let down by a for-profit, privacy-invading corporation, but by a free software organization.

The sad story of wiki.softwarelivre.org

My first blog was hosted in a blog engine written by me, which in turn ran on a TWiki (and later Foswiki) instance previously available at wiki.softwarelivre.org, hosted by ASL.org.

I was the one who introduced the tool to the organization in the first place. I had come from a previous, very fruitful experience using wikis to create educational material while in university, which ultimately led me to become a core TWiki, and later Foswiki, developer.

In 2004, I had just moved to Porto Alegre, got involved in ASL.org, and there was a demand for a tool like that. Two years later I left Porto Alegre, and some time after that I also left the daily operations of ASL.org, when it became clear that the organization was not really prepared for remote participation. I was still maintaining the wiki software and the OS for quite some years after, until I wasn't anymore.

In 2016, the server that hosted it went haywire, and there were no backups. A lot of people and free software groups lost their content forever. My blog was the least important content in there. To mention just a few examples, here are some groups that lost their content in there:

  • The Brazilian Foswiki user group hosted a bunch of getting started documentation in Portuguese, organized meetings, and coordinated through there.
  • GNOME Brazil hosted its homepage there until the moment the server broke.
  • The Inkscape Brazil user group had an amazing space there where they shared tutorials, a gallery of user-contributed drawings, and a lot more.

Some of this can still be reached via the Internet Archive Wayback Machine, but that is only useful for recovering content, not for it to be used by the public.

The announced tragedy of softwarelivre.org

My next blog after that was hosted at softwarelivre.org, a Noosfero instance also hosted by ASL.org. When it was introduced in 2010, this Noosfero instance became responsible for the main softwarelivre.org domain name. This was a bold move by ASL.org, and was a demonstration of trust in a local free software project, led by a local free software cooperative (Colivre).

I was a lead developer in the Noosfero project for a long time, and I was involved in maintaining that server as well.

However, for several years there has been little to no investment in maintaining that service. I already expect that it will probably blow up at some point, as the wiki did, or that it may just be shut down on purpose.

On the responsibility of organizations

Today, a large part of what most people consider "the internet" is controlled by a handful of corporations. Most popular services on the internet might look like they are gratis (free as in beer), but running those services is definitely not without costs. So when you use services provided by for-profit companies and are not paying for them with money, you are paying with your privacy and attention.

Society needs independent organizations to provide alternatives.

The market can solve a part of the problem by providing ethical services and charging for them. This is legitimate, and as long as there is transparency about how people's data and communications are handled, there is nothing wrong with it.

But that only solves part of the problem, as there will always be people who can't afford to pay, and people and communities who can afford to pay, but would rather rely on a nonprofit. That's where community-based services, provided by nonprofits, are also important. We should have more of them, not less.

So it worries me to realize that ASL.org left the community in the dark. Losing the wiki wasn't even the first event of its kind, as the listas.softwarelivre.org mailing list server, with years and years of community communications archived in it, broke with no backups in 2012.

I do not intend to blame the ASL.org leadership personally; they are all well-meaning, good people. But as an organization, it failed to recognize the importance of this role of service provider. I can even include myself in it: I was a member of the ASL.org board some 15 years ago; I was involved in the deployment of both the wiki and Noosfero, the former as a volunteer and the latter professionally. Yet, I did nothing to plan the maintenance of the infrastructure going forward.

When well-meaning organizations fail, people who are not willing to have their data and communications exploited for profit are left to their own devices. I can afford a virtual private server, and have the technical knowledge to import my old content into a static website generator, so I did it. But what about all the people who can't, or don't?

Of course, these organizations have to solve the challenge of being sustainable, and being able to pay professionals to maintain the services that the community relies on. We should be thankful to these organizations, and their leadership needs to recognize the importance of those services, and actively plan for them to be kept alive.

Categories: FLOSS Project Planets

