Planet Python
Django Weblog: PyCharm & Django Campaign 2024 - encore
The Django Software Foundation's biggest fundraising event of the year is here!
Get 30% off PyCharm, Support Django
Each year, our friends at JetBrains, the creators of PyCharm, run an incredible deal. You get a 30% discounted year of PyCharm, AND the DSF gets 100% of the money. Yes, 100%! It's making a donation and directly getting a great product in return! This is available to new users, and to those who used PyCharm in the past, stopped, and want to try it again.
The fundraiser
The fundraiser started during DjangoCon Europe in June, and is now back on from September 22nd to October 6th. Buy PyCharm and support Django!
In the past, the PyCharm fundraiser run with JetBrains has provided approximately one quarter of the Django Software Foundation's budget!
Donations like this fundraiser allow the DSF to function. Our two wonderful Fellows, Natalia Bidart and Sarah Boyce keep Django running smoothly, picking up pieces that would otherwise not happen.
The other side of the DSF is our support for Django groups across the globe. We supported every DjangoCon, particularly by donating funding towards opportunity grants so that more people are able to attend these conferences. The DSF also supports smaller events around the world, including DjangoGirls events.
PyCharm
Finally, I want to tell you about PyCharm itself.
PyCharm is an integrated development environment (IDE) that helps professional Python web developers be more productive, be more confident, and write better code. It supports the full Python web workflow out of the box, including popular Python web frameworks, such as Django, frontend technologies, and databases.
Here are the main benefits of using PyCharm in your Django development:
- Django (including templates), Flask, FastAPI
- Database management (Postgres, Redis)
- JS, React, Node.js, TailwindCSS
- Built-in HTTP Client and endpoint tools
Get Django work done with PyCharm, a powerful IDE tailored for Django web development!
Consider this the easiest charitable donation you will ever make, when you get such a great product in return!
Get 30% off PyCharm, Support Django
Other ways to donate
If you would like to donate in another way, especially if you are already a PyCharm customer, here are other ways to donate to the DSF:
- On our website via credit card
- Via GitHub Sponsors
- For those able to make a larger donation, particularly corporate sponsors ($2000+), more information is here: Corporate membership
Python Bytes: #402 How to monetize your blog
Quansight Labs Blog: Multi-dimensional Sparse Arrays in SciPy
Hynek Schlawack: Python Project-Local Virtualenv Management Redux
One of my first TIL entries was about how you can imitate Node’s node_modules semantics in Python on UNIX-like operating systems. A lot has happened since then (for the better!) and it’s time for an update. direnv still rocks, though.
Armin Ronacher: FSL: A Better Business/Open Source Balance Than AGPL
subtext: in my opinion, and for companies (and their users) that want a good balance between protecting their core business with Open Source ideals.
Following up to my thoughts on the case for funding Open Source, there is a second topic I want to discuss in more detail: Open Source and commercialization. As our founder likes to say: Open Source is not a business model. And indeed it really isn't. However, this does not mean that Open Source and Open Source licenses aren't a critical consideration for a technology company and a fascinating interconnection between the business model and license texts.
As some of you might know I'm a strong proponent of the concept now branded as “Fair Source” which we support at Sentry. Fair Source is defined by a family of springing licenses that give you the right to read and modify code, while also providing an exclusivity period for the original creator to protect their core business. After a designated time frame, the code transitions into Open Source via a process called DOSP: Delayed Open Source Publication. This is not an entirely new idea, and I have been writing about it a few times before [1] [2].
A recurring conversation I have in this context is about the AGPL (Affero General Public License) as an alternative vehicle for balancing business goals and Open Source ideals. This topic has also resurfaced recently because of Elasticsearch's “Open Source, Again” post, where they announced that they will license Elasticsearch under the AGPL.
In my view, while AGPL is a true Open Source license, it is an inferior choice compared to the FSL (the Functional Source License, a Fair Source license) for many projects. Let me explain my reasoning.
The Single Vendor Model
When you take a project like Sentry, which started as an Open Source project and later turned into a VC funded company, its model revolves around a commercial entity being in charge. That model is often referred to as “single vendor.” This is also the case with companies like Clickhouse Inc. or Elastic and their respective projects.
Sentry today is no longer Open Source; it's Fair Source (FSL licensed). Elastic, on the other hand, is indeed unquestionably Open Source (AGPL among others). What both projects have in common is that they value brand (including trademarks), they have strong opinions on how the project should be run, and they use a CLA to give themselves the right to re-license it under other terms.
In a "single vendor" setup, the company behind the project holds significant power (for ~150 years give or take).
The Illusion of Equality
When you look at the AGPL as a license it's easy to imagine that everybody is equal. Every contributor to a project agrees with the underlying license assumptions of the AGPL and acts accordingly. However, in practice, things are more complicated — especially when it comes to commercial usage. Many legal departments are wary of the AGPL and the broader GPL family of licenses. Some challenges are also inherent to the licenses, such as not being able to publish *GPL code to the app store.
You can see this also with Elasticsearch. The code is not just AGPL licensed, you can also retrieve it under the ELv2 and SSPL licensing terms. Something that Elastic can do due to the CLAs in place.
Compare this to Linux, which is licensed under GPLv2 with a syscall exception. This very specific license was chosen by Linus Torvalds to ensure the project's continued success while keeping it truly open. In Linux's case, no single entity has more rights than anyone else. There is not even a realistic option to relicense to a newer version of the GPL.
The FSL explicitly recognizes the reality that the single vendor holds significant power but balances it by ensuring that this power diminishes over time. This idea can also be found in copyright law, where a creator's work eventually enters the public domain. A key difference with software though is that it continuously evolves, making it hard to pinpoint when it might eventually become public domain as thousands of people contribute to it.
The FSL is much more aggressive in that aspect. If we run Sentry into the ground and the business fails, within two years anyone can pick up the pieces and revive it like a phoenix from the ashes. This isn't just hypothetical. Bryan Cantrill recently mentioned Oxide's desire to fork CockroachDB once its BUSL change date kicks in. While that day hasn't come yet, it's a real possibility.
Dying Companies
Let's face it: companies fail. I have no intentions for Sentry to be one of them, but you never know. Companies also don't just die once; they can do so repeatedly. Xapian is an example I like to quote here. It started out as a GPL v2+ licensed search project called Muscat, which was built at Cambridge. After several commercial acquisitions and transitions, the project eventually became closed source (which was possible because the creators held the copyright). Some of the original creators, together with the community, forked the last GPLv2 version into a project that eventually became known as Xapian.
What's the catch? The catch is that the only people who could license it more liberally than GPLv2 are long gone from the project. Xapian refers to its current license as “a historical accident”. The license choice causes some challenges, specifically in how Xapian is embedded. There are three remaining entities that would need to agree to a relicensing. From my understanding, none of those entities commercially uses Xapian's original code today, but they also have no interest in actually supporting a potential relicensing.
Unlike trademark law, which has a concept of abandonment, the copyright situation is stricter. It would take two lifetimes for Xapian to enter the public domain, and at that point it will probably be of interest mostly for archival purposes.
Equal Grounds Now or Later
If Xapian's original code had been FSL licensed, it would have become Apache 2.0 (or MIT with the alternative model) many times over. You don't need to hope that the original license holder still cares: by the time you get hold of the source code, you already have an irrevocable promise that it will eventually turn into Apache 2.0 (or MIT with the alternative license choice), which is about as no-strings-attached as it can get.
So in some ways a comparison is “AGPL now and forever” vs “FSL now, Apache 2.0/MIT in two years”.
That's not to say that the AGPL (or SSPL) doesn't have its merits. Xapian, as much as it might suffer from its accidental license choice, is also a successful Open Source project that has helped a lot of developers out there. Maybe the license did in fact work out well for them, and because everybody is in the same boat it has also created a community of equals.
I do believe however it's important to recognize that “single-vendor AGPL with a CLA” is absolutely not the same as “community driven AGPL project without the CLA”.
The title claims that FSL balances Open Source better than AGPL, and it's fair to question how a license that isn't Open Source can achieve that. The key lies in understanding that Fair Source is built on the concept of delayed Open Source. Yes, there's a waiting period, but it’s a relatively short one: just two years. Count to two and the code transitions to full, unshackled openness. And that transition to Open Source is a promise that can't be taken from you.
[1] Originally about the BUSL license, which introduced the idea (Open Source, SaaS and Monetization)
[2] Later about our own DOSP-based license, the FSL (FSL: A License For the Bazaar, Not the Cathedral)
Seth Michael Larson: PyCon Taiwan 2024 Keynote
Published 2024-09-22 by Seth Larson
Here are my slides and overview of my PyCon Taiwan 2024 Keynote titled "Bytes, Pipes, and People". The video will be published to YouTube; subscribe to the PyCon Taiwan YouTube channel to be notified when it's available.
Software security has historically been treated as extra or "nice-to-have", not a core feature that users expect. This means we have accumulated plenty of tech debt. Now there are growing incentives and requirements for producing secure software to meet user expectations.
Luckily for us, many of the tools, data, and systems already exist to help us build a culture of security for Python. These tools help relay messages between software creators and users so we can collaborate on this shared goal.
By actively participating you are starting the positive feedback loop of software security, making users safer faster!
Below is a list of actions that you can implement to build a culture of security for Python:
Maintainers
- Adopt Trusted Publishers if you use GitHub Actions, GitLab CI/CD, Google Cloud Build, or ActiveState to publish Python packages.
- Use lock files for the build and publish workflow, such as pip-tools, Poetry, or PDM.
- Adopt a lightweight security policy. Do not stress about CVEs: fix, release, publish a CVE.
- Contribute new insecure code detections to Bandit.
- Update dependencies that have vulnerabilities. Prioritize projects that are connected to the internet.
- Update software on a semi-regular basis to avoid out-of-date and end-of-life software. Staying up-to-date makes it easier to upgrade to fixed versions in the future.
- Run tests with PYTHONWARNINGS configured so that DeprecationWarning and PendingDeprecationWarning are raised as errors, so you don't miss deprecated features (see the sketch after this list).
- Create a secure open source usage policy, using verified data to evaluate open source projects. Do not install new projects without checking your policy first.
- If you need a Software Bill-of-Materials document, there are tools available to generate one. Those tools will improve over time as new Python package SBOM standards are adopted.
- Add a vulnerability scanner like pip-audit, Grype, or Trivy.
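To make the warnings advice above concrete, here is a minimal sketch of how a test suite might escalate deprecation warnings into errors, for example in a `conftest.py` (the same effect can be had by setting the PYTHONWARNINGS environment variable before running the tests):

```python
import warnings

# Fail loudly on APIs scheduled for removal, so future upgrades aren't blocked
warnings.filterwarnings("error", category=DeprecationWarning)
warnings.filterwarnings("error", category=PendingDeprecationWarning)
```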
- What is Software Bill-of-Materials ("SBOM")?
- Trusted Publishers
- PyPI blog
- Bandit
- Warnings in Python
- pip-audit
- Scientific Python SPEC-8: "Securing the Release Process"
- Supply chain security threats (SLSA)
- Grype
- Trivy
- Ecosystem.ms
- Libraries.io
- Deps.dev
- Trusty
- pip-tools
- Poetry
- PDM
- uv
- Sigstore Python
- Yanking Python packages
- HTTP Archive
- Sonatype 2023 Annual State of Software Supply Chain Report
- Kushal Das for Python Language Summit photos
- StackOverflow
This work is licensed under CC BY-SA 4.0
Python Morsels: Prompting a user for input
We can prompt our users for input with Python's built-in input function.
Table of contents
- Prompting for user input
- Customizing the prompt text
- Prompt users with Python's built-in input function
Python has a built-in input function that we can use to prompt a user of our program to enter some text:
```
>>> color = input("Favorite color:")
Favorite color:▯
```

Our Python REPL is now hanging and waiting for us (the user) to input text. Let's type purple:
```
>>> color = input("Favorite color:")
Favorite color:purple▯
```

After the user has typed the text they'd like to enter, they can hit the Enter key to lock in the value.
```
>>> color = input("Favorite color:")
Favorite color:purple
>>>
```

Now the color variable contains the string purple:
```
>>> color
'purple'
```

Customizing the prompt text
Note that the input function …
Read the full article: https://www.pythonmorsels.com/prompting-for-input/
Erik Marsja: Using Pandas to Read JSON from URL
When working with data in Python, using Pandas to read JSON from URL is an excellent tool that lets you directly load JSON data from a web source into a Pandas dataframe. This tutorial will teach you the steps to accomplish this task, building upon our previous discussions on reading JSON with Python more generally.
How to use Pandas to Read JSON from URL
First, let us look at a simple example of using Pandas to read JSON from a URL.
```python
import pandas as pd

# URL containing JSON data
url = "http://api.open-notify.org/astros.json"

# Read JSON data from URL into a DataFrame
df = pd.read_json(url)

# Display the dataframe
print(df)
```

In the code chunk above, we start by importing the Pandas library. The URL variable contains the web address where the JSON data is hosted. The pd.read_json(url) function is then used to read the JSON data from the URL and load it into a Pandas DataFrame, which is a two-dimensional labeled data structure with columns of potentially different types. Finally, print(df) displays the DataFrame, allowing us to see the imported data in tabular format.
Now that we have seen a basic example, let us learn more about the parameters of the pd.read_json() method to understand how we can customize the reading process.
The pd.read_json() method has several parameters that allow you to fine-tune how the JSON data is read and converted into a dataframe. Here is an overview of the most important parameters:
- path_or_buf: The string containing the URL or the path to the JSON file. This is the source of the JSON data that will be read.
- orient: Defines the expected JSON string format. Default is ‘columns’. This parameter specifies the orientation of the JSON data. Other options include ‘split’, ‘records’, ‘index’, and ‘values’.
- typ: Specifies the type of object to be returned. Default is ‘frame’. This parameter can be set to ‘series’ if you want to return a Series instead of a DataFrame.
- dtype: Determines whether to infer types of objects. Default is ‘None’. This parameter can be used to specify the data type for each column.
- convert_axes: Whether to convert the axes to another type. Default is ‘True’. This parameter allows you to convert the axes to a specified data type.
- convert_dates: List of columns to convert to dates. Default is ‘True’. This parameter can be used to specify which columns should be parsed as dates.
- keep_default_dates: Whether to include default date parsers. Default is ‘True’. This parameter determines whether to use the default date parsers provided by Pandas.
- precise_float: Whether to use a high precision floating point converter. Default is ‘False’. This parameter can be set to ‘True’ if you need high precision for float values.
- date_unit: Unit for encoding datetime. Default is ‘None’. This parameter can be used to specify the time unit for encoding datetime objects.
- encoding: Specifies the encoding to be used. Default is ‘utf-8’. This parameter determines the encoding for reading the JSON data.
- lines: Whether to read the JSON file as a JSON object per line. Default is ‘False’. This parameter can be set to ‘True’ if the JSON data is in a line-delimited format.
Using these parameters allows you to better control how JSON data is read and processed, enabling you to tailor the DataFrame to your needs.
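For instance, here is a small sketch combining a few of these parameters. The URL is hypothetical and assumes an endpoint that serves line-delimited JSON with a "created" column:

```python
import pandas as pd

# Hypothetical endpoint returning one JSON object per line (JSON Lines)
url = "https://example.com/records.jsonl"

# lines=True parses one record per line; convert_dates parses "created" as datetimes
df = pd.read_json(url, lines=True, convert_dates=["created"])
print(df.dtypes)
```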
Summary
To summarize, we have learned how to use Pandas to read JSON data from a URL. We explored a practical example and detailed the parameters of the pd.read_json() method, enhancing our ability to customize the data reading process. Handling nested JSON data can be more challenging, but that will be covered in a future post.
I would appreciate it if you could share this post and leave your comments below. Your feedback is invaluable!
Here are some other reading data-related tutorials:
- How to Convert JSON to Excel in Python with Pandas
- How to use Pandas read_html to Scrape Data from HTML Tables
The post Using Pandas to Read JSON from URL appeared first on Erik Marsja.
Real Python: The Real Python Podcast – Episode #221: Thriving as a Developer With ADHD
What are strategies for being a productive developer with ADHD? How can you help your team members with ADHD to succeed and complete projects? This week on the show, we speak with Chris Ferdinandi about his website and podcast "ADHD For the Win!"
PyCharm: What’s New in PyCharm 2024.2.2!
PyCharm 2024.2.2 is here with many key updates, including Python support improvements, new Django features, and enhancements to the Data View tool window!
Visit our What’s New page for more details on all these features and to explore many others. You can download the latest version from our download page or update your current version through our free Toolbox App.
Download PyCharm 2024.2.2

PyCharm 2024.2.2 highlights

Django enhancements PRO

New code completion suggestions
When working with models, PyCharm now offers field completion suggestions in a variety of cases, such as Model.save(update_fields=[…]), Model.refresh_from_db(fields=[…]), Model.clean_fields(exclude=[…]), and so on.
Quick-fix to create a method for an unresolved ViewSet
If a ViewSet has an unresolved reference, PyCharm suggests a quick-fix to introduce the missing method. Use Alt + Enter to call it.
Data View PRO
You can now look at n-dimensional NumPy arrays in the Data View tool window. Define the array you would like to inspect, along with a specific dimension or slice, in a special field at the bottom of the tool window, and PyCharm will display a table with the results.
Python support improvements

Support for default types for type parameters (PEP 696)
Improve typing with PyCharm's support for Python 3.13's ability to define default types for type parameters. The IDE now incorporates default types for type parameters for both old-style and new-style generic classes, functions, and type aliases, and takes them into account in type inference.
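As a quick illustration of what PEP 696 defaults look like, here is a minimal sketch (requires Python 3.13; the class name is just an example):

```python
# New-style generic class whose type parameter defaults to int (PEP 696)
class Queue[T = int]:
    def __init__(self) -> None:
        self.items: list[T] = []

q = Queue()  # With no explicit type argument, T defaults to int
q.items.append(42)
```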
Pattern matching: Foldable match statements
To improve the readability of code with large pattern-matching statements, you can now use folding for entire match statements or for separate cases inside them.
Download PyCharm 2024.2.2
Visit our What’s New page to learn about other useful features included in this release, or read the release notes for the full breakdown, including more details on the features mentioned here.
If you encounter any problems, please report them in our issue tracker so we can address them promptly.
Connect with us on X (formerly Twitter) to share your thoughts on PyCharm 2024.2.2!
Talk Python to Me: #477: Awesome Text Tricks with NLP and spaCy
Wingware: Wing Python IDE Version 10.0.6 - September 20, 2024
Wing 10.0.6 adds support for Python 3.13 and fixes some issues with AI development, code refactoring, and unit testing with pytest.
See the change log for details.
Download Wing 10 Now: Wing Pro | Wing Personal | Wing 101 | Compare Products
What's New in Wing 10
AI Assisted Development
Wing Pro 10 takes advantage of recent advances in the capabilities of generative AI to provide powerful AI assisted development, including AI code suggestion, AI driven code refactoring, description-driven development, and AI chat. You can ask Wing to use AI to (1) implement missing code at the current input position, (2) refactor, enhance, or extend existing code by describing the changes that you want to make, (3) write new code from a description of its functionality and design, or (4) chat in order to work through understanding and making changes to code.
Examples of requests you can make include:
"Add a docstring to this method" "Create unit tests for class SearchEngine" "Add a phone number field to the Person class" "Clean up this code" "Convert this into a Python generator" "Create an RPC server that exposes all the public methods in class BuildingManager" "Change this method to wait asynchronously for data and return the result with a callback" "Rewrite this threaded code to instead run asynchronously"Yes, really!
Your role changes to one of directing an intelligent assistant capable of completing a wide range of programming tasks in relatively short periods of time. Instead of typing out code by hand every step of the way, you are essentially directing someone else to work through the details of manageable steps in the software development process.
Support for Python 3.12, 3.13, and ARM64 Linux
Wing 10 adds support for Python 3.12 and 3.13, including (1) faster debugging with PEP 669 low impact monitoring API, (2) PEP 695 parameterized classes, functions and methods, (3) PEP 695 type statements, and (4) PEP 701 style f-strings.
Wing 10 also adds support for running Wing on ARM64 Linux systems.
Poetry Package Management
Wing Pro 10 adds support for Poetry package management in the New Project dialog and the Packages tool in the Tools menu. Poetry is an easy-to-use cross-platform dependency and package manager for Python, similar to pipenv.
Ruff Code Warnings & Reformatting
Wing Pro 10 adds support for Ruff as an external code checker in the Code Warnings tool, accessed from the Tools menu. Ruff can also be used as a code reformatter in the Source > Reformatting menu group. Ruff is an incredibly fast Python code checker that can replace or supplement flake8, pylint, pep8, and mypy.
Try Wing 10 Now!
Wing 10 is a ground-breaking new release in Wingware's Python IDE product line. Find out how Wing 10 can turbocharge your Python development by trying it today.
Downloads: Wing Pro | Wing Personal | Wing 101 | Compare Products
See Upgrading for details on upgrading from Wing 9 and earlier, and Migrating from Older Versions for a list of compatibility notes.
Matt Layman: Docker Go, JS, Static Files - Building SaaS #203
Programiz: Getting Started with Python
PyCharm: How to Use FastAPI for Machine Learning
This is a guest post from Cheuk Ting Ho, a data scientist who contributes to multiple open-source libraries, such as pandas, Polars, and Jupyter Notebook.
FastAPI provides a quick way to build a backend service with Python. With a few decorators, you can turn your Python function into an API application.
It is widely used by many companies including Microsoft, Uber, and Netflix. According to the Python Developers Survey, FastAPI usage has grown from 21% in 2021 to 29% in 2023. For data scientists, it’s the second most popular framework, with 31% using it.
In this blog post, we will cover the basics of FastAPI for data scientists who may want to build a quick prototype for their project.
What is FastAPI?
FastAPI is a popular web framework for building APIs with Python, based on standard Python type hints. It is intuitive and easy to use, and it can provide a production-ready application in a short period of time. It is fully compatible with OpenAPI and JSON Schema.
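To illustrate the "few decorators" point, here is a minimal sketch of a FastAPI application (the file and endpoint names are just examples):

```python
from fastapi import FastAPI

app = FastAPI()

# A single decorator turns this function into a GET endpoint
@app.get("/ping")
async def ping():
    return {"message": "pong"}
```

Assuming the file is named main.py, running it with an ASGI server such as uvicorn (`uvicorn main:app --reload`) serves the endpoint at /ping, with interactive documentation generated automatically at /docs.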
Why use FastAPI for machine learning?
Most teams working on machine learning projects consist of data scientists whose domains and professions lie on the statistics side of things. They may not have experience developing software or applications to ship their machine learning projects. FastAPI enables data scientists to easily create APIs for the following projects:
Deploying prediction models
The data science team may have trained a model for the prediction of the sales demand in a warehouse. To make it useful, they have to provide an API interface so other parts of the stock management system can use this new prediction functionality.
Suggestion engines
One of the very common uses of machine learning is as a system that provides suggestions based on the users’ choices. For example, if someone puts certain products in their shopping cart, more items can be suggested to that user. Such an e-commerce system requires an API call to the suggestion engine that takes input parameters.
Dynamic dashboards and reporting systems
Sometimes, reports for data science projects need to be presented as dashboards so users can inspect the results themselves. One possible approach is to have the data model provide an API. Frontend developers can use this API to create applications that allow users to interact with the data.
Advantages of using FastAPI
Compared to other Python web frameworks, FastAPI is simple yet fully functional. Mainly using decorators and type hints, it allows you to build a web application without the complexity of building a whole ORM (object-relational mapping) model and with the flexibility of using any database, including any SQL and NoSQL databases. FastAPI also provides automatic documentation generation, support for additional information and validation for query parameters, and good async support.
Fast development
Creating API calls in FastAPI is as easy as adding decorators in the Python code. Little to no backend experience is needed for anyone who wants to turn a Python function into an application that will respond to API calls.
Fast documentation
FastAPI provides automatic interactive API documentation using Swagger UI, which is an industry standard. No extra effort is required to build clear documentation with API call examples. This creates an advantage for busy data science teams who may not have the energy and expertise to write technical specifications and documentation.
Easy testing
Writing tests is one of the most important steps in software development, but it can also be one of the most tedious, especially when the time of the data science team is valuable. Testing FastAPI is made simple thanks to Starlette and HTTPX. Most of the time no monkey patching is needed and tests are easy to write and understand.
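As a sketch of what such a test can look like with FastAPI's TestClient (built on Starlette and HTTPX), assuming the app object from this tutorial lives in `main.py`:

```python
from fastapi.testclient import TestClient
from main import app  # assumes the FastAPI app defined in main.py

client = TestClient(app)

def test_root():
    # Calls the endpoint in-process; no running server is needed
    response = client.get("/")
    assert response.status_code == 200
    assert response.json()["Name"] == "Penguins Prediction"
```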
Fast deployment
FastAPI comes with a CLI tool that can bridge development and deployment smoothly. It allows you to switch between development mode and production mode easily. Once development is completed, the code can be easily deployed using a Docker container with images that have Python prebuilt.
How to use FastAPI for a machine learning project
In this example, we will turn a classification prediction model that uses the Nearest Neighbors algorithm to predict the species of various penguins based on their bill and flipper length into a backend application. We will provide an API that takes parameters from the query parameters of a URL and gives back the prediction. This shows how a prototype can be made quickly by any data scientist with no backend development experience.
We will use a simple `KNeighborsClassifier` on the penguin data set as an example. Details of how to build the model will be omitted, but feel free to check out the relevant notebook here. In the following tutorial, we will focus on the usage of FastAPI and explain some fundamental concepts. We will be building a prototype to do so.
1. Start a FastAPI project with PyCharm
In this blog post, we will be using PyCharm Professional 2024.1. The best way to start using FastAPI is to create a FastAPI project with PyCharm. When you click New Project in PyCharm, you will be presented with a large selection of projects to choose from. Select the FastAPI tab:
From here, you can put in the name of your project and take advantage of other options such as initializing Git and the virtual environment that you want to use.
After doing so, you will see the basic structure of a FastAPI project set up for you.
There is also a `test_main.http` file set up for you to quickly test all the endpoints.
Next, set up our environment dependency with `requirements.txt` by selecting Sync Python Requirements under PyCharm's Tools menu.
Then you can select the `requirements.txt` file to be used.
You can copy and use this `requirements.txt` file. We will be using pandas and scikit-learn for the machine learning part of the project. Also, add the `penguins.csv` file to your project directory.
Arrange your machine learning code in the `main.py` file. We will start with a script that trains our model:
```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

data = pd.read_csv('penguins.csv')
data = data.dropna()

le = preprocessing.LabelEncoder()

X = data[["bill_length_mm", "flipper_length_mm"]]
le.fit(data["species"])
y = le.transform(data["species"])

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = Pipeline(
    steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))]
)
clf.set_params().fit(X_train, y_train)
```

We can place the above code after `app = FastAPI()`. All of it will be run when we start the application.
However, there is a better way to run the start-up code we used to set up our model. We will cover that in a later part of the blog post.
Next we will look at how to add our model to FastAPI functionality. As a first step, we will add a response to the root of the URL and just simply return a message about our model in JSON format. Change the code in `async def root():` from “Hello world” to our message like this:
@app.get("/") async def root(): return { "Name": "Penguins Prediction", "description": "This is a penguins prediction model based on the bill length and flipper length of the bird.", }Now, test our application. First, we will start our application, which is easy in PyCharm. Just press the arrow button () next to your project name at the top.
If you are using the default settings, your application will run on http://127.0.0.1:8000. You can double-check that by looking at the prompt from the Run window.
Once the process has started, let’s go to `test_main.http` and press the first arrow button next to `GET`. From the HTTP Client in the Services window, you will see the response message that we put in.
The response JSON file is also saved for future inspection.
Next, we would like to let users make predictions by providing query parameters in the URL. Let’s add the code below after the `root` function.
@app.get("/predict/") async def predict(bill_length_mm: float = 0.0, flipper_length_mm: float = 0.0): param = { "bill_length_mm": bill_length_mm, "flipper_length_mm": flipper_length_mm } if bill_length_mm <=0.0 or flipper_length_mm <=0.0: return { "parameters": param, "error message": "Invalid input values", } else: result = clf.predict([[bill_length_mm, flipper_length_mm]]) return { "parameters": param, "result": le.inverse_transform(result)[0], }Here we set the default value of the `bill_length_mm` and `flipper_length_mm` to be 0 if the user didn’t input a value. We also add a check to see if either of the values is 0 and return an error message instead of trying to predict which penguin the input refers to.
If the inputs are not 0, we will use the model to make a prediction and use the encoder to do an inverse transformation to get the label of the predicted target, i.e. the name of the penguin species.
This is not the only way you can verify inputs. You can also consider using Pydantic for input verification.
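For reference, a minimal sketch of what that could look like (the model name is hypothetical; the Field constraints reject non-positive values before the endpoint code runs):

```python
from pydantic import BaseModel, Field

# Hypothetical request model for the two penguin measurements
class PenguinFeatures(BaseModel):
    bill_length_mm: float = Field(gt=0)
    flipper_length_mm: float = Field(gt=0)
```

Validating incoming data against such a model lets FastAPI return a detailed 422 validation error automatically, instead of the hand-written error message above.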
If you are using the same version of FastAPI as stated in `requirements.txt`, FastAPI automatically refreshes the service and applies changes on save. Now put in a new URL in `test_main.http` to test (separated from the URL before with ###):
```
###
GET http://127.0.0.1:8000/predict/?bill_length_mm=40.3&flipper_length_mm=195
Accept: application/json
```

Press the arrow button next to our new URL and see the output.
Next you can try a URL with one or both of the parameters removed to see the error message.
Last, let’s look at how we can set up our model with FastAPI lifespan events. The advantage of doing that is we can make sure no request will be accepted while the model is still being set up and the memory used will be cleaned up afterward. To do that, we will use an `asynccontextmanager`. Before `app = FastAPI()` we will add:
```python
from contextlib import asynccontextmanager

ml_models = {}

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Set up the ML model here
    yield
    # Clean up the models and release resources
    ml_models.clear()
```

Now we will move the import of pandas and scikit-learn to be alongside the other imports. We will also move our setup code inside the `lifespan` function, setting the machine learning model and LabelEncoder inside `ml_models` like this:
```python
from fastapi import FastAPI
from contextlib import asynccontextmanager
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

ml_models = {}

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Set up the ML model here
    data = pd.read_csv('penguins.csv')
    data = data.dropna()

    le = preprocessing.LabelEncoder()

    X = data[["bill_length_mm", "flipper_length_mm"]]
    le.fit(data["species"])
    y = le.transform(data["species"])

    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

    clf = Pipeline(
        steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))]
    )
    clf.set_params().fit(X_train, y_train)

    ml_models["clf"] = clf
    ml_models["le"] = le

    yield
    # Clean up the models and release resources
    ml_models.clear()
```

After that we will add the `lifespan=lifespan` parameter in `app = FastAPI()`:
```python
app = FastAPI(lifespan=lifespan)
```

Now save and test again. Everything should work and we should see the same result as before.
Afterthought: When to train the model?
From our example, you may wonder when the model is trained. Since `clf` is trained at the beginning, i.e. when the service is launched, you may wonder why we do not train the model every time someone makes a prediction.
We do not want the model to be trained every time someone makes a call, because it costs way more resources to re-train everything. Additionally, it may cause race conditions since our FastAPI application is working concurrently. This is especially the case if we use live data that changes all the time.
Technically, we can set up an API to collect data and re-train the model (which we will demonstrate in the next example). Other options would be to schedule a re-train at a certain time when a certain amount of new data has been collected or to let a super user upload new data and trigger the re-training.
So far, we are aiming to build a prototype that runs locally. Check out this article on deploying a FastAPI project on a cloud service for more information.
What is concurrency?
To put it simply, concurrency is like when you are cooking in the kitchen: while waiting for the water to boil, you go ahead and chop the vegetables. In the web service world, the server talks to many terminals, and the communication between the server and the terminals is slower than in most internal applications, so the server does not talk to and serve the terminals one by one. Instead, it talks to and serves many of them at the same time while fulfilling their requests. You may want to check out this explanation in the FastAPI documentation.
In Python, this is achieved by using async code. In our FastAPI code, the use of `async def` instead of `def` is obvious evidence that FastAPI is working concurrently. There are other keywords used in Python async code, like `await` and `asyncio.get_event_loop`, but we won’t be able to cover them in this blog post.
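For a taste of the underlying mechanics, here is a tiny standalone sketch (not FastAPI-specific): `await` yields control back to the event loop while waiting, which is what lets a server handle other requests in the meantime:

```python
import asyncio

async def fetch_data():
    # Simulates waiting on slow I/O without blocking the event loop
    await asyncio.sleep(1)
    return {"status": "done"}

print(asyncio.run(fetch_data()))
```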
How to use FastAPI for an image classification project
To discover more FastAPI functionality, we will add an image classification model based on the MNIST example in Keras to our application as well (we are using the TensorFlow backend). If you installed the `requirements.txt` provided, you should have Keras and Pillow installed for image processing and building a convolutional neural network (CNN).
1. Refactoring
Before we start, let’s refactor our code. To make the code more organized, we will put the model setup for the penguins prediction in a function:
```python
def penguins_pipeline():
    data = pd.read_csv('penguins.csv')
    data = data.dropna()

    le = preprocessing.LabelEncoder()

    X = data[["bill_length_mm", "flipper_length_mm"]]
    le.fit(data["species"])
    y = le.transform(data["species"])

    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

    clf = Pipeline(
        steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))]
    )
    clf.set_params().fit(X_train, y_train)

    return clf, le
```

Then we rewrite the lifespan function. With full-line code completion in PyCharm, it is very easy:
2. Set up a CNN model for MNIST prediction
In similar fashion to the penguin prediction model, we create a function for MNIST prediction (and we will store the meta parameters globally):
```python
# MNIST model meta parameters
num_classes = 10
input_shape = (28, 28, 1)
batch_size = 128
epochs = 15

def mnist_pipeline():
    # Load the data and split it between train and test sets
    (x_train, y_train), _ = keras.datasets.mnist.load_data()

    # Scale images to the [0, 1] range
    x_train = x_train.astype("float32") / 255
    # Make sure images have shape (28, 28, 1)
    x_train = np.expand_dims(x_train, -1)

    # convert class vectors to binary class matrices
    y_train = keras.utils.to_categorical(y_train, num_classes)

    model = keras.Sequential(
        [
            keras.Input(shape=input_shape),
            layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
            layers.MaxPooling2D(pool_size=(2, 2)),
            layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
            layers.MaxPooling2D(pool_size=(2, 2)),
            layers.Flatten(),
            layers.Dropout(0.5),
            layers.Dense(num_classes, activation="softmax"),
        ]
    )

    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

    model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

    return model
```

Then add the model setup in the lifespan function:
ml_models["cnn"] = mnist_pipeline()Note that since this is added, every time you make changes to `main.py` and save, the model will be trained again. It can take a bit of time. So in development you may want to use a dummy model that requires no training time at all or a pre-trained model instead. After training, the CNN model will be ready to go.
3. Set up a POST endpoint for uploading an image file for prediction
To set up an endpoint that takes an upload file, we have to use UploadFile in FastAPI:
@app.post("/predict-image/") async def predicct_upload_file(file: UploadFile): img = await file.read() # process image for prediction img = Image.open(BytesIO(img)).convert('L') img = np.array(img).astype("float32") / 255 img = np.expand_dims(img, (0, -1)) # predict the result result = ml_models["cnn"].predict(img).argmax(axis=-1)[0] return {"filename": file.filename, "result": str(result)}Please note that this is a POST endpoint (so far we have only set up GET endpoints).
Don’t forget to import `UploadFile` from `fastapi`:
```python
from fastapi import FastAPI, UploadFile
```
```python
from PIL import Image
from io import BytesIO
```

To test this using the PyCharm HTTP Client with a test image file, we will make use of the `multipart/form-data` encoding. You can check out the HTTP request syntax here. This is what you will put in the `test_main.http` file:
```
###
POST http://127.0.0.1:8000/predict-image/ HTTP/1.1
Content-Type: multipart/form-data; boundary=boundary

--boundary
Content-Disposition: form-data; name="file"; filename="test_img0.png"

< ./test_img0.png
--boundary--
```

4. Add an API to collect data and trigger retraining
Now, here comes the retraining. We set up a POST endpoint like above to accept a zip file which contains training images and labels. The zip file will then be processed and the training data will be prepared. After that we will fit the CNN model again:
@app.post("/upload-images/") async def retrain_upload_file(file: UploadFile): img_files = [] labels_file = None train_img = None with ZipFile(BytesIO(await file.read()), 'r') as zfile: for fname in zfile.namelist(): if fname[-4:] == '.txt' and fname[:2] != '__': labels_file = fname elif fname[-4:] == '.png': img_files.append(fname) if len(img_files) == 0: return {"error": "No training images (png files) found."} else: for fname in sorted(img_files): with zfile.open(fname) as img_file: img = img_file.read() # process image img = Image.open(BytesIO(img)).convert('L') img = np.array(img).astype("float32") / 255 img = np.expand_dims(img, (0, -1)) if train_img is None: train_img = img else: train_img = np.vstack((train_img, img)) if labels_file is None: return {"error": "No training labels file (txt file) found."} else: with zfile.open(labels_file) as labels: labels_data = labels.read() labels_data = labels_data.decode("utf-8").split() labels_data = np.array(labels_data).astype("int") labels_data = keras.utils.to_categorical(labels_data, num_classes) # retrain model ml_models["cnn"].fit(train_img, labels_data, batch_size=batch_size, epochs=epochs, validation_split=0.1) return {"message": "Model trained successfully."}Remember to import `ZipFile`:
```python
from zipfile import ZipFile
```

If we now try the endpoint with this zip file of 1000 retraining images and labels, you will see that it takes a moment for the response to come, as the training is taking a while:
```
POST http://127.0.0.1:8000/upload-images/ HTTP/1.1
Content-Type: multipart/form-data; boundary=boundary

--boundary
Content-Disposition: form-data; name="file"; filename="training_data.zip"

< ./retrain_img.zip
--boundary--
```

Imagine the zip files contain more training data or you’re retraining a more complicated model. The user would then have to wait for a long time, and it would seem like things are not working for them.
5. Retrain the model with BackgroundTasks
A better way to handle retraining is to process the training data after receiving it and check that it is in the right format, then give a response saying that the retraining has started and train the model in `BackgroundTasks`. Here is how to do it. First, we will add `BackgroundTasks` to our `upload-images` endpoint:
@app.post("/upload-image/") async def retrain_upload_file(file: UploadFile, background_tasks: BackgroundTasks): ...Remember to import it from `fastapi`:
```python
from fastapi import FastAPI, UploadFile, BackgroundTasks
```

Then, we will put the fitting of the model into the `background_tasks`:
```python
# retrain model
background_tasks.add_task(
    ml_models["cnn"].fit,
    train_img,
    labels_data,
    batch_size=batch_size,
    epochs=epochs,
    validation_split=0.1
)
```

Also, we will update the message in the response:
return {"message": "Data received successfully, model training has started."}Now test the endpoint again. You will see that the response has arrived much quicker, and if you look at the Run window, you’ll see that the training is running after the response has arrived.
At this point, more functionality can be added, for example, an option to notify the user later (e.g. via email) when the training is finished or track the training progress in a dashboard when a full application is built.
FastAPI provides an easy way to convert your data science project into a working application in several easy steps. It is perfect for data science teams that want to provide an application prototype for their machine learning model which can be further developed into a professional web application if needed.
PyCharm Professional is the Python IDE that allows you to develop FastAPI applications more easily with a preconfigured project for FastAPI, coding assistance, tailored run/debug configurations, and the Endpoints tool window for managing API endpoints efficiently.
Get a free trial of PyCharm Professional

In this blog post, we showed the process of providing a simple API for a pre-trained prediction model. To learn more about FastAPI, I would suggest checking out the official FastAPI documentation. If you’re choosing between different frameworks, explore how FastAPI differs from Django.
About the author

Cheuk Ting Ho
Cheuk has been a Data Scientist at various companies – a job that demands high numerical and programming skills, especially in Python. Following her passion for the tech community, Cheuk has been a Developer Advocate for three years. She also contributes to multiple open-source libraries like Hypothesis, Pytest, pandas, Polars, PyO3, Jupyter Notebook, and Django. Cheuk is currently a consultant and trainer at CMD Limes.
Stack Abuse: Securing Your Email Sending With Python: Authentication and Encryption
Email encryption and authentication are modern security techniques that you can use to protect your emails and their content from unauthorized access.
Everyone, from individuals to business owners, uses emails for official communication, which may contain sensitive information. Therefore, securing emails is important, especially when cyberattacks like phishing, smishing, etc. are soaring high.
In this article, I'll discuss how to send emails in Python securely using email encryption and authentication.
Setting Up Your Python Environment
Before you start creating the code for sending emails, set up your Python environment first with the configurations and libraries you'll need.
You can send emails in Python using:
- Simple Mail Transfer Protocol (SMTP): This application-level protocol simplifies the process since Python offers an in-built library or module (smtplib) for sending emails. It's suitable for businesses of all sizes as well as individuals to automate secure email sending in Python. We're using the Gmail SMTP service in this article.
- An email API: You can leverage a third-party API like Mailtrap Python SDK, SendGrid, Gmail API, etc., to dispatch emails in Python. This method offers more features and high email delivery speeds, although it requires some investment.
In this tutorial, we're opting for the first choice - sending emails in Python using SMTP, facilitated by the smtplib library. This library uses the RFC 821 protocol and interacts with SMTP and mail servers to streamline email dispatch from your applications. Additionally, you should install packages to enable Python email encryption, authentication, and formatting.
Step 1: Install Python
Install the Python programming language on your computer (Windows, macOS, Linux, etc.). You can visit the official Python website and download and install it from there.
If you've already installed it, run this code to verify it:
python --version
Step 2: Install Necessary Modules and Libraries

- smtplib: This handles SMTP communications. Use the code below to import `smtplib` and connect with your email server:

```python
import smtplib
```

- email module: This provides classes to construct and parse emails (with header fields such as Subject, To, and From). It also facilitates email encoding and decoding with Multipurpose Internet Mail Extensions (MIME).

- MIMEText: It's used for formatting your emails and supports sending emails with text and attachments like images, videos, etc. Import this using the code below:

```python
from email.mime.text import MIMEText
```

- MIMEMultipart: Use this library to add attachments and text sections separately in your email.

```python
from email.mime.multipart import MIMEMultipart
```

- ssl: It provides Secure Sockets Layer (SSL) encryption.
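Putting these together, the examples later in this article all start from the same few building blocks; here is a minimal sketch of the imports (the `ssl` context is used for the encrypted connections shown below):

```python
import smtplib
import ssl
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

# Default SSL context: uses the system's trusted CA certificates to
# validate the mail server when an encrypted connection is opened
context = ssl.create_default_context()
```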
To send emails using the Gmail SMTP email service, I recommend creating a test account to develop the code. Delete the account once you've tested the code.
The reason is that you'll need to modify the security settings of your Gmail account to enable access from the Python code that sends emails. This might expose your login details, compromising security. In addition, testing will flood your account with too many test emails.
So, instead of using your own Gmail account, create a new one for creating and testing the code. Here's how to do this:
- Create a fresh Gmail account
- Set up your app password:
Google Account > Security > Turn on 2-Step Verification > Security > Set up an App Password
Next, define a name for the app password and click on "Generate". You'll get a 16-digit password after following some instructions on the screen. Store the password safely.
Use this password while sending emails in Python. Here, we're using Gmail SMTP, but if you want to use another mail service provider, follow the same process. Alternatively, contact your company's IT team to seek support in accessing your SMTP server.
Email Authentication With PythonEmail authentication is a security mechanism that verifies the sender's identity, ensuring the emails from a domain are legitimate. If you have no email authentication mechanism in place, your emails might land in spam folders, or malicious actors can spoof or intercept them. This could affect your email delivery rates and the sender's reputation.
This is the reason you must enable Python email authentication mechanisms and protocols, such as:
- SMTP authentication: If you're sending emails using an SMTP server like Gmail SMTP, you can use this method of authentication. It verifies the sender's authenticity when sending emails via a specific mail server.
- SPF: Stands for Sender Policy Framework and checks whether the IP address of the sending server is among the authorized senders listed in the domain's DNS records.
- DKIM: Stands for DomainKeys Identified Mail and is used to add a digital signature to emails to ensure no one can alter the email's content while it's in transmission. The receiver's server will then verify the digital signature. Thus, all your emails and their content stay secure and unaltered.
- DMARC: Stands for Domain-based Message Authentication, Reporting, and Conformance. DMARC instructs mail servers what to do if an email fails authentication. In addition, it provides reports upon detecting any suspicious activities on your domain.
To authenticate your email in Python using SMTP, the smtplib library is useful. Here's how Python SMTP security works:
```python
import smtplib

server = smtplib.SMTP('smtp.domain1.com', 587)
server.starttls()  # Start TLS for secure connection
server.login('my_email@domain1.com', 'my_password')
message = "Subject: Test Email."
server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message)
server.quit()
```

Implementing email authentication will add an additional layer of security to your emails and protect them from attackers or from being marked as spam.
Encrypting Emails With PythonEncrypting emails enables you to protect your email's content so that only authorized senders and receivers can access or view the content. Encrypting emails with Python is done using encryption techniques to encode the email message and transform it into a secure and unreadable format (also known as ciphertext).
This way, email encryption secures the message from unauthorized access or attackers even if they intercept the email.
Here are different types of email encryption:
- SSL: This stands for Secure Sockets Layer, one of the most popular and widely used encryption protocols. SSL ensures email confidentiality by encrypting data transmitted between the mail server and the client.
- TLS: This stands for Transport Layer Security and is a common email encryption protocol today. Many consider it a great alternative to SSL. It encrypts the connection between an email client and the mail server to prevent anyone from intercepting the email during its transmission.
- E2EE: This stands for end-to-end encryption, ensuring only the intended recipient with valid credentials can decrypt the email content and read it. It aims to prevent email interception and secure the message.
If your mail server requires SSL encryption, here's how to send an email in Python:
```python
import smtplib
import ssl

context = ssl.create_default_context()
# SSL connections require port number 465
server = smtplib.SMTP_SSL('smtp.domain1.com', 465, context=context)
server.login('my_email@domain1.com', 'my_password')
message = "Subject: SSL Encrypted Email."
server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message)
server.quit()
```

For TLS connections, you'll need the smtplib library:
```python
import smtplib

server = smtplib.SMTP('smtp.domain1.com', 587)  # TLS requires port number 587
server.starttls()  # Start TLS encryption
server.login('my_email@domain1.com', 'my_password')
message = "Subject: TLS Encrypted Email."
server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message)
server.quit()
```

For end-to-end encryption, you'll need more advanced libraries or tools such as GnuPG, OpenSSL, Signal Protocol, and more.
Combining Authentication and EncryptionEmail Security with Python requires both encryption and authentication. This ensures that mail servers find the email legitimate and it stays safe from cyber attackers and unauthorized access during transmission. For email encryption, you can use either SSL or TLS and combine it with SMTP authentication to establish a robust email connection.
Now that you know how to enable email encryption and authentication in your emails, let's examine some complete code examples to understand how you can send secure emails in Python using Gmail SMTP and email encryption (SSL).
Code Examples
1. Sending a Plain Text Email

```python
import smtplib
from email.mime.text import MIMEText

subject = "Plain Text Email"
body = "This is a plain text email using Gmail SMTP and SSL."
sender = "sender1@gmail.com"
receivers = ["receiver1@gmail.com", "receiver2@gmail.com"]
password = "my_password"

def send_email(subject, body, sender, receivers, password):
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = ', '.join(receivers)
    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp_server:
        smtp_server.login(sender, password)
        smtp_server.sendmail(sender, receivers, msg.as_string())
    print("The plain text email is sent successfully!")

send_email(subject, body, sender, receivers, password)
```

Explanation:
- sender: This contains the sender's address.
- receivers: This contains email addresses of receiver 1 and receiver 2.
- msg: This is the content of the email.
- sendmail(): This is the SMTP object's instance method. It takes three parameters - sender, receiver, and msg and sends the message.
- with: This is a context manager that is used to properly close an SMTP connection once an email is sent.
- MIMEText: This holds only plain text.
2. Sending an Email With Attachments
To send an email in Python with attachments securely, you will need some additional libraries like MIMEBase and encoders. Here's the code for this case:
import smtplib
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

sender = "sender1@gmail.com"
password = "my_password"
receiver = "receiver1@gmail.com"
subject = "Email with Attachments"
body = "This is an email with attachments created in Python using Gmail SMTP and SSL."

# Read the file and wrap it in a generic MIME part
with open("attachment.txt", "rb") as attachment:
    part = MIMEBase("application", "octet-stream")
    part.set_payload(attachment.read())

encoders.encode_base64(part)
# This header tells the client to treat the part as a named attachment
part.add_header(
    "Content-Disposition",
    "attachment; filename=attachment.txt",
)

message = MIMEMultipart()
message['Subject'] = subject
message['From'] = sender
message['To'] = receiver

text_part = MIMEText(body)
message.attach(text_part)
message.attach(part)  # Attach the file

with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
    server.login(sender, password)
    server.sendmail(sender, receiver, message.as_string())

Explanation:
- MIMEMultipart: This class lets you combine a text body and one or more attachments in a single email.
- 'rb': The file is opened in binary mode so its raw bytes can be read into the attachment.
- MIMEBase: This generic MIME object works for any file type.
- encode_base64(): The file is encoded in Base64 so it can be transmitted safely in an email.
To send an HTML email in Python using Gmail SMTP, you need the MIMEText class.

Here's the full code for sending an HTML email in Python:
import smtplib
from email.mime.text import MIMEText

sender = "sender1@gmail.com"
password = "my_password"
receiver = "receiver1@gmail.com"
subject = "HTML Email in Python"
body = """
<html>
  <body>
    <p>HTML email created in Python with SSL and Gmail SMTP.</p>
  </body>
</html>
"""

# The 'html' subtype attaches the HTML content to the email
message = MIMEText(body, 'html')
message['Subject'] = subject
message['From'] = sender
message['To'] = receiver

with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
    server.login(sender, password)
    server.sendmail(sender, receiver, message.as_string())

Testing Your Email With Authentication and Encryption

Testing your emails before sending them to recipients is important. It lets you catch issues or bugs in the sending code, as well as problems with formatting, content, and more.
Thus, always test your emails on a staging server before delivering them to your target recipients, especially when sending emails in bulk. Testing emails provides the following advantages:
- Ensures the email sending functionality works correctly
- Confirms emails are properly formatted, with no broken links or attachments
- Prevents flooding recipients' inboxes with large numbers of test emails
- Improves email deliverability and reduces spam rates
- Verifies the email and its contents stay protected from attacks and unauthorized access
To test this combined setup of sending emails in Python with authentication and encryption enabled, use an email testing server like Mailtrap Email Testing. It captures all the SMTP traffic from your staging environment so you can inspect and debug emails before sending them. It also analyzes the email content, validates CSS/HTML, and provides a spam score so you can improve your email sending.
To get started:
- Open Mailtrap Email Testing
- Go to 'My Inbox'
- Click on 'Show Credentials' to get your test credentials - login and password details
Here's the full code example for testing your emails:
import smtplib
from socket import gaierror

port = 2525
smtp_server = "sandbox.smtp.mailtrap.io"  # Define the SMTP server separately
login = "xyz123"  # Paste your Mailtrap login details
password = "abc$$"  # Paste your Mailtrap password
sender = "test_sender@test.com"
receiver = "test_receiver@example.com"
# Note the blank line separating the headers from the body
message = f"""\
Subject: Hello There!
To: {receiver}
From: {sender}

This is a test email."""

try:
    # Use Mailtrap-generated credentials for port, server name, login, and password
    with smtplib.SMTP(smtp_server, port) as server:
        server.login(login, password)
        server.sendmail(sender, receiver, message)
    print('Sent')
except (gaierror, ConnectionRefusedError):  # In case of connection errors
    print('Unable to connect to the server.')
except smtplib.SMTPServerDisconnected:
    print('Server connection failed!')
except smtplib.SMTPException as e:
    print('SMTP error: ' + str(e))

If there's no error, you should see this message in the receiver's inbox:
This is a test email.

Best Practices for Secure Email Sending

Consider the following Python email best practices for secure email sending:
- Protect data: Take appropriate security measures to protect sensitive data such as SMTP credentials, API keys, etc. Store them in a secure, private place like config files or environment variables, ensuring no one can access them publicly (see the sketch after this list).
- Encryption and authentication: Always use email encryption and authentication so that only authorized individuals can access your emails and their content. For authentication, you can use advanced methods like API keys, two-factor authentication, single sign-on (SSO), etc. Similarly, use strong encryption protocols like SSL, TLS, and E2EE.
- Error handling: Manage network issues, authentication failures, and other problems by handling errors with try/except blocks in your code.
- Rate limiting: Maintain high email deliverability by rate-limiting your email sending functionality so you don't exceed your service's limits.
- Validate emails: Validate the email addresses on your list and remove invalid ones to improve deliverability and keep your domain from being marked as spam. You can use an email validation tool for this.
- Educate: Keep your team up to date on secure email practices and cybersecurity risks. Monitor your spam score and email deliverability rates, and work to improve them.
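As a minimal sketch of the credential-handling advice above, you could read the SMTP login and password from environment variables instead of hardcoding them; the variable names here are just placeholders:

import os
import smtplib

# Read credentials from the environment, e.g. set via `export SMTP_LOGIN=...`
login = os.environ["SMTP_LOGIN"]
password = os.environ["SMTP_PASSWORD"]

with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
    server.login(login, password)
    # ... send your message here ...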
You can secure email sending in Python using advanced encryption methods like SSL, TLS, and end-to-end encryption, along with authentication protocols and techniques such as SPF, DMARC, 2FA, and API keys.
By combining these security measures, you can protect confidential email information, improve email deliverability, and maintain trust with your target recipients, since only individuals with the appropriate credentials can access your messages. This helps prevent unauthorized access, data breaches, and other cybersecurity attacks.
The Python Show: 47 - Python Projects of 2024
I’ve been working on lots of projects this year. Here are the ones I highlighted in this episode:
JupyterLab 101 Book on Kickstarter
A book on Textual
Armin Ronacher: Accidental Spending: A Case For an Open Source Tax?
Both last week at London tech leaders and this week at the Open Source Summit in Vienna I engaged in various discussions about pledging money to Open Source. At Sentry we have been funding our Open Source dependencies for a few years now and we're trying to encourage others to do the same.
It’s not an easy ask, of course. One quite memorable point raised was what I would call “accidental spending”. The story goes like this: an engineering team spins up a bunch of Kubernetes machines. As the fleet grows in scale, some inefficiencies creep in. To troubleshoot or optimize, additional services such as load balancers, firewalls, cloud provider log services, etc. are provisioned with minimal discussion. Initially none of that was part of the plan, but ever so slightly, for every computing resource, some extra stuff is paid on top, creating largely hidden costs. Ideally all of that pays off (after all, by debugging quicker you hopefully reduce downtime, and by having that load balancer you can auto-scale and save on unused computing resources). But often, the payoff feels abstract and is hard to quantify.
I call those purchases “accidental” because they are proportional to the deployed infrastructure and act largely like a tax on top of everything. Only after a while does the scale of that line item become apparent. Intentionally purchasing a third-party system, on the other hand, is a very deliberate act: it requires conversations, and more scrutiny is applied before a credit card goes into a new service. Companies providing services understand this and position themselves accordingly. Their play could be to make the case that their third-party solution is better, cheaper, etc.
Open Source funding could be seen through both of these lenses. Today, in many ways, pledging money to Open Source is a very intentional decision. It requires discussions, persuasion, and justification, and the purpose and pay-off are not entirely clear. Companies are not used to the idea of funding Open Source and don't have a strong model to reason about these investments. Likewise, many Open Source projects don't have a good way of dealing with money and might lack the governance to handle funds effectively. After all, many of these projects are run by individuals, not formal organizations.
Companies are unlikely to fund something without understanding the return on investment. One better understood idea is to turn that one “random person in Nebraska” maintaining a critical dependency into a well-organized team with good op-sec. But for that to happen, funding needs to scale from pennies to dollars, making it really worthwhile.
My colleague Chad Whitacre floated an idea: what if platforms like AWS or GitHub started splitting the check, by adding a line item to their customers' invoices to support Open Source funding? It would turn giving to Open Source into something more like a tax, and might leverage the general willingness to pay a little extra to do good. If we all paid 3% on top of our cloud or SaaS bills to Open Source, it would quickly add up.
While I’m intrigued by the idea, I also have my doubts that this would work. It goes back to the problem mentioned earlier: some Open Source projects have no governance or are not even set up to receive money. How much value you put on a dependency is also very individual. Just because an NPM package has a lot of downloads does not necessarily mean it's critical to the mission of the company. rrweb is a good example for us at Sentry. It sits at the core of our session replay product, but since we vendor a pinned fork, you would not see rrweb in our dependency tree. We also value that package more highly than any algorithm could infer about its importance to us.
So the challenge with the tax — as appealing as it is — is that it might make the “purchase decision” of funding Open Source easier, but it would probably make the distribution problem much worse. Deliberate, intentional funding is key. At least for the moment.
Still, it’s worth considering. The “what if” is a powerful idea. Using a restaurant analogy, the “open-source tax” is like the mandatory VAT or health surcharge on your bill: no choice is involved. Another model is more like the tip suggestions on a receipt: there is a choice, but also guidance on what's appropriate to contribute.
The current model we propose with our upcoming Open Source Pledge is, like a tip suggestion, to recommend what you should give in relation to your developer workforce: take the average number of full-time engineers you employ over a year and multiply it by 2,000. That is the amount in US dollars you should give to your Open Source dependencies. For example, a company averaging 25 engineers would pledge $50,000 per year.
That sounds like a significant amount! But let's put it in relation to a typical developer you employ: it's less than a fifth of what you would pay in FICA (Federal Insurance Contributions Act) taxes in the US, and less than the communal tax you would pay in Austria. I'm sure you can think of similar payroll taxes in your country.
I believe that after step one, recognizing there is a funding problem, follows an obvious step two: establishing a baseline funding amount that stands in relation to the business you own or are a part of. Using the size of the development team as a metric offers an objective and quantifiable starting point. The beauty of the developer count in particular, to my mind, is that it's somewhat independently observable from both the outside and the inside [1]. The latter is important! It creates a baseline for people within a company to start a conversation about Open Source funding.
If you have feedback on this, particularly the pledge, I invite you to mail me or to leave a comment on the Pledge's issue tracker.
[1] There is an analogy to historical taxation here. For instance, the Window Tax was a tax based on the number of windows in a building. That made enforcement easy because you could count them from street level. The downside was obviously the unintended consequences this caused. Something to always keep in mind!

Python Engineering at Microsoft: Announcing the new Python Data Science Extension Pack for VS Code
We’re thrilled to announce the launch of the new Python Data Science Extension Pack for Visual Studio Code! This powerful pack brings together some of the most popular and essential VS Code extensions, making it your one-stop shop for all things data science in Python.
What’s Inside?

Our extension pack is designed to streamline your data science journey from start to finish. Whether you’re preparing data, conducting analysis, visualizing results, or building and training machine learning models, we’ve got you covered.
This Data Science extension pack currently includes four extensions:
- Python – Provides rich support for the Python language such as IntelliSense, debugging, formatting, linting, code navigation, refactoring, variable explorer, test explorer, and more.
- Jupyter – Used to create and edit Jupyter Notebooks, add and run code/markdown cells, render plots, create presentation-friendly versions of your notebook by exporting to HTML or PDF and more.
- GitHub Copilot – An AI pair programmer tool that helps you write code faster and smarter.
- Data Wrangler – A code-centric data viewing and cleaning tool to explore, visualize, and clean tabular data.
Dive into the world of data science by installing the Python Data Science Extension Pack for VS Code from the VS Code extension marketplace.
We encourage you to provide feedback and file issues. Additionally, if there are other VS Code extensions that you feel are essential to the data science workflow, please let us know by creating a ticket in our GitHub repo.
Real Python: Python 3.13 Preview: Free Threading and a JIT Compiler
Although the final release of Python 3.13 is scheduled for October 2024, you can download and install a preview version today to explore the new features. Notably, the introduction of free threading and a just-in-time (JIT) compiler are among the most exciting enhancements, both designed to give your code a significant performance boost.
In this tutorial, you’ll:
- Compile a custom Python build from source using Docker
- Disable the Global Interpreter Lock (GIL) in Python
- Enable the Just-In-Time (JIT) compiler for Python code
- Determine the availability of new features at runtime (see the sketch after this list)
- Assess the performance improvements in Python 3.13
- Make a C extension module targeting Python’s new ABI
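As a hedged preview of the runtime-detection step, the following sketch checks whether you're on a free-threaded build and whether the GIL is actually enabled. The Py_GIL_DISABLED config variable and the private sys._is_gil_enabled() helper are what CPython 3.13 exposes, but treat them as experimental details that may change:

import sys
import sysconfig

# Truthy if this interpreter was compiled as a free-threaded build
free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

# sys._is_gil_enabled() is new in 3.13; fall back to True on older versions
gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()

print(f"Python {sys.version.split()[0]}")
print(f"Free-threaded build: {free_threaded}")
print(f"GIL enabled at runtime: {gil_enabled}")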
Check out what’s new in the Python changelog for a complete list of the upcoming features and improvements. This document contains a quick summary of the release highlights as well as a detailed breakdown of the planned changes.
To download the sample code and other resources accompanying this tutorial, click the link below:
Get Your Code: Click here to download the free sample code that shows you how to work with the experimental free threading and JIT compiler in Python 3.13.
Take the Quiz: Test your knowledge with our interactive “Python 3.13: Free-Threading and a JIT Compiler” quiz. You’ll receive a score upon completion to help you track your learning progress. In this quiz, you'll revisit how to compile a custom Python build, disable the Global Interpreter Lock (GIL), enable the Just-In-Time (JIT) compiler, and more.
Free Threading and JIT in Python 3.13: What’s the Fuss?

Before going any further, it’s important to note that the majority of improvements in Python 3.13 will remain invisible to the average Joe. This includes free threading (PEP 703) and the JIT compiler (PEP 744), which have already sparked a lot of excitement in the Python community.
Keep in mind that they’re both experimental features aimed at power users, who must take extra steps to enable them at Python’s build time. None of the official channels will distribute Python 3.13 with these additional features enabled by default. This maintains backward compatibility and shields most users from the glitches that are to be expected of experimental features.
Note: Don’t try to use Python 3.13 with the experimental features in a production environment! It may cause unexpected problems, and the Python Steering Council reserves the right to remove these features entirely from future Python releases if they prove to be unstable. Treat them as an experiment to gather real-world data.
In this section, you’ll get a bird’s-eye view of these experimental features so you can set the right expectations. You’ll find detailed explanations of how to enable them and evaluate their impact on Python’s performance in the remainder of this tutorial.
Free Threading Makes the GIL Optional

Free threading is an attempt to remove the Global Interpreter Lock (GIL) from CPython, which has traditionally been the biggest obstacle to achieving thread-based parallelism when performing CPU-bound tasks. In short, the GIL allows only one thread of execution to run at any given time, regardless of how many cores your CPU is equipped with. This prevents Python from leveraging the available computing power effectively.
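To make the limitation concrete, here's a small, hedged experiment: on a GIL-enabled build, running a CPU-bound function in two threads takes roughly as long as running it twice sequentially, because only one thread executes Python bytecode at a time (exact timings will vary by machine):

import time
from threading import Thread

def cpu_bound(n=10_000_000):
    # Pure-Python busy loop: holds the GIL while it runs
    while n > 0:
        n -= 1

start = time.perf_counter()
threads = [Thread(target=cpu_bound) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# With the GIL, this is close to 2x the single-threaded time;
# on a free-threaded build it can approach 1x on a multi-core CPU
print(f"Two CPU-bound threads took {elapsed:.2f}s")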
There have been many attempts in the past to bypass the GIL in Python, each with varying levels of success. You can read about these attempts in the tutorial on bypassing the GIL. While previous attempts were made by third parties, this is the first time that the core Python development team has taken similar steps with the permission of the steering council, even if some reservations remain.
Note: Python 3.12 approached the GIL obstacle from a different angle by allowing the individual subinterpreters to have their independent GILs. This can improve Python’s concurrency by letting you run different tasks in parallel, but without the ability to share data cheaply between them due to isolated memory spaces. In Python 3.13, you’ll be able to combine subinterpreters with free threading.
The removal of the GIL would have significant implications for the Python interpreter itself and especially for the large body of third-party code that relies on it. Because free threading essentially breaks backward compatibility, the long-term plan for its implementation is as follows:
- Experimental: Free threading is introduced as an experimental feature and isn’t a part of the official Python distribution. You must make a custom Python build to disable the GIL.
- Enabled: The GIL becomes optional in the official Python distribution but remains enabled by default to allow for a transition period.
- Disabled: The GIL is disabled by default, but you can still enable it if needed for compatibility reasons.
There are no plans to completely remove the GIL from the official Python distribution at the moment, as that would cause significant disruption to legacy codebases and libraries. Note that the steps outlined above are just a proposal subject to change. Also, free threading may not pan out at all if it makes single-threaded Python run slower than without it.
Until the GIL becomes optional in the official Python distribution, which may take a few more years, the Python development team will maintain two incompatible interpreter versions. The vanilla Python build won’t support free threading, while the special free-threaded flavor will have a slightly different Application Binary Interface (ABI) tagged with the letter “t” for threading.
This means that C extension modules built for stock Python won’t be compatible with the free-threaded version and the other way around. Maintainers of those external modules will be expected to distribute two packages with each release. If you’re one of them, and you use the Python/C API, then you’ll learn how to target CPython’s new ABI in the final section of this tutorial.
JIT Compiles Python to Machine Code

As an interpreted language, Python takes your high-level code and executes it on the fly without the need for prior compilation. This has both pros and cons. Some of the biggest advantages of interpreted languages include better portability across different hardware architectures and a quick development time due to the lack of a compilation step. At the same time, interpretation is much slower than directly executing code native to your machine.
Note: To be more precise, Python interprets bytecode instructions, an intermediate binary representation between pure Python and machine code. The Python interpreter compiles your code to bytecode when you import a module and stores the resulting bytecode in the __pycache__ folder. This doesn’t inherently make your Python scripts run faster, but loading a pre-processed bytecode can indeed speed up their startup time.
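You can peek at the bytecode instructions the interpreter executes with the standard-library dis module; a quick illustrative sketch (the exact output varies between Python versions):

import dis

def add(a, b):
    return a + b

# Disassemble the function to show the bytecode the interpreter runs
dis.dis(add)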
Languages like C and C++ leverage Ahead-of-Time (AOT) compilation to translate your high-level code into machine code before you ship your software. The benefit of this is faster execution since the code is already in the computer’s mother tongue. While you no longer need a separate program to interpret the code, you must compile it separately for all target platforms that you want supported. You should also handle platform-specific differences yourself.
Read the full article at https://realpython.com/python313-free-threading-jit/ »