Planet Python

Planet Python - http://planetpython.org/

PyCharm: How to Use FastAPI for Machine Learning

Thu, 2024-09-19 04:59

This is a guest post from Cheuk Ting Ho, a data scientist who contributes to multiple open-source libraries, such as pandas, Polars, and Jupyter Notebook.

FastAPI provides a quick way to build a backend service with Python. With a few decorators, you can turn your Python function into an API application.

It is widely used by many companies including Microsoft, Uber, and Netflix. According to the Python Developers Survey, FastAPI usage has grown from 21% in 2021 to 29% in 2023. For data scientists, it’s the second most popular framework, with 31% using it.

In this blog post, we will cover the basics of FastAPI for data scientists who may want to build a quick prototype for their project. 

What is FastAPI?

FastAPI is a popular web framework for building APIs with Python, based on standard Python type hints. It is intuitive and easy to use, and it can provide a production-ready application in a short period of time. It is fully compatible with OpenAPI and JSON Schema.

Why use FastAPI for machine learning?

Most teams working on machine learning projects consist of data scientists whose domains and professions lie on the statistics side of things. They may not have experience developing software or applications to ship their machine learning projects. FastAPI enables data scientists to easily create APIs for the following projects:

Deploying prediction models

The data science team may have trained a model for the prediction of the sales demand in a warehouse. To make it useful, they have to provide an API interface so other parts of the stock management system can use this new prediction functionality.

Suggestion engines

One of the very common uses of machine learning is as a system that provides suggestions based on the users’ choices. For example, if someone puts certain products in their shopping cart, more items can be suggested to that user. Such an e-commerce system requires an API call to the suggestion engine that takes input parameters.

Dynamic dashboards and reporting systems

Sometimes, reports for data science projects need to be presented as dashboards so users can inspect the results themselves. One possible approach is to have the data model provide an API. Frontend developers can use this API to create applications that allow users to interact with the data.

Advantages of using FastAPI

Compared to other Python web frameworks, FastAPI is simple yet fully functional. Mainly using decorators and type hints, it allows you to build a web application without the complexity of building a whole ORM (object-relational mapping) model and with the flexibility of using any database, including any SQL and NoSQL databases. FastAPI also provides automatic documentation generation, support for additional information and validation for query parameters, and good async support.

Fast development

Creating API calls in FastAPI is as easy as adding decorators in the Python code. Little to no backend experience is needed for anyone who wants to turn a Python function into an application that will respond to API calls.

Fast documentation

FastAPI provides automatic interactive API documentation using Swagger UI, which is an industry standard. No extra effort is required to build clear documentation with API call examples. This creates an advantage for busy data science teams who may not have the energy and expertise to write technical specifications and documentation.

Easy testing

Writing tests is one of the most important steps in software development, but it can also be one of the most tedious, especially when the time of the data science team is valuable. Testing FastAPI is made simple thanks to Starlette and HTTPX. Most of the time no monkey patching is needed and tests are easy to write and understand.

Fast deployment

FastAPI comes with a CLI tool that can bridge development and deployment smoothly. It allows you to switch between development mode and production mode easily. Once development is completed, the code can be easily deployed using a Docker container with images that have Python prebuilt.

How to use FastAPI for a machine learning project

In this example, we will turn a classification prediction model that uses the Nearest Neighbors algorithm to predict the species of various penguins based on their bill and flipper length into a backend application. We will provide an API that takes parameters from the query parameters of a URL and gives back the prediction. This shows how a prototype can be made quickly by any data scientist with no backend development experience.

We will use a simple `KNeighborsClassifier` on the penguin data set as an example. Details of how to build the model will be omitted, but feel free to check out the relevant notebook here. In the following tutorial, we will focus on the usage of FastAPI and explain some fundamental concepts. We will be building a prototype to do so. 

1. Start a FastAPI project with PyCharm

In this blog post, we will be using PyCharm Professional 2024.1. The best way to start using FastAPI is to create a FastAPI project with PyCharm. When you click New Project in PyCharm, you will be presented with a large selection of projects to choose from. Select the FastAPI tab:



From here, you can put in the name of your project and take advantage of other options such as initializing Git and the virtual environment that you want to use.

After doing so, you will see the basic structure of a FastAPI project set up for you.



There is also a `test_main.http` file set up for you to quickly test all the endpoints.


2. Set up environment dependencies

Next, set up the environment dependencies with `requirements.txt` by selecting Sync Python Requirements under PyCharm’s Tools menu.



Then you can select the `requirements.txt` file to be used.



You can copy and use this `requirements.txt` file. We will be using pandas and scikit-learn for the machine learning part of the project. Also, add the `penguins.csv` file to your project directory.

3. Set up your machine learning model

Arrange your machine learning code in the `main.py` file. We will start with a script that trains our model:

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn import preprocessing
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    data = pd.read_csv('penguins.csv')
    data = data.dropna()

    le = preprocessing.LabelEncoder()
    X = data[["bill_length_mm", "flipper_length_mm"]]
    le.fit(data["species"])
    y = le.transform(data["species"])
    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
    clf = Pipeline(
        steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))]
    )
    clf.set_params().fit(X_train, y_train)

We can place the above code after `app = FastAPI()`. All of it will be run when we start the application.



However, there is a better way to run the start-up code we used to set up our model. We will cover that in a later part of the blog post.

4. Request a response

Next we will look at how to add our model to FastAPI functionality. As a first step, we will add a response to the root of the URL and just simply return a message about our model in JSON format. Change the code in `async def root():` from “Hello world” to our message like this:

    @app.get("/")
    async def root():
        return {
            "Name": "Penguins Prediction",
            "description": "This is a penguins prediction model based on the bill length and flipper length of the bird.",
        }

Now, test our application. First, we will start our application, which is easy in PyCharm. Just press the arrow button next to your project name at the top.



If you are using the default settings, your application will run on http://127.0.0.1:8000. You can double-check that by looking at the prompt from the Run window.

Once the process has started, let’s go to `test_main.http` and press the first arrow button next to `GET`. From the HTTP Client in the Services window, you will see the response message that we put in.



The response JSON file is also saved for future inspection.

5. Request with query parameters

Next, we would like to let users make predictions by providing query parameters in the URL. Let’s add the code below after the `root` function.

    @app.get("/predict/")
    async def predict(bill_length_mm: float = 0.0, flipper_length_mm: float = 0.0):
        param = {
            "bill_length_mm": bill_length_mm,
            "flipper_length_mm": flipper_length_mm
        }
        if bill_length_mm <= 0.0 or flipper_length_mm <= 0.0:
            return {
                "parameters": param,
                "error message": "Invalid input values",
            }
        else:
            result = clf.predict([[bill_length_mm, flipper_length_mm]])
            return {
                "parameters": param,
                "result": le.inverse_transform(result)[0],
            }

Here we set the default value of `bill_length_mm` and `flipper_length_mm` to 0 in case the user doesn’t provide a value. We also check whether either value is 0 or negative and, if so, return an error message instead of trying to predict which penguin the input refers to.

If both inputs are greater than 0, we use the model to make a prediction and use the encoder to do an inverse transformation to get the label of the predicted target, i.e. the name of the penguin species.

This is not the only way you can verify inputs. You can also consider using Pydantic for input verification.
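As one possible illustration of the Pydantic route, the sketch below declares the two measurements as a model whose fields must be strictly positive. The `PenguinFeatures` class and the `validate_features` helper are hypothetical names that are not part of the example project; only `BaseModel`, `Field`, and `ValidationError` come from Pydantic.

```python
from pydantic import BaseModel, Field, ValidationError


class PenguinFeatures(BaseModel):
    """Hypothetical input model: both measurements must be greater than 0."""

    bill_length_mm: float = Field(gt=0)
    flipper_length_mm: float = Field(gt=0)


def validate_features(bill_length_mm: float, flipper_length_mm: float):
    """Return (features, None) on success or (None, error text) on failure."""
    try:
        features = PenguinFeatures(
            bill_length_mm=bill_length_mm,
            flipper_length_mm=flipper_length_mm,
        )
    except ValidationError as exc:
        return None, str(exc)
    return features, None


print(validate_features(40.3, 195))
print(validate_features(0.0, 195)[1])
```

With this approach the range check lives in one declarative place instead of an `if` statement inside the endpoint.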

If you are using the same version of FastAPI as stated in `requirements.txt`, FastAPI automatically refreshes the service and applies changes on save. Now put in a new URL in `test_main.http` to test (separated from the URL before with ###):

    ###
    GET http://127.0.0.1:8000/predict/?bill_length_mm=40.3&flipper_length_mm=195
    Accept: application/json

Press the arrow button next to our new URL and see the output.



Next you can try a URL with one or both of the parameters removed to see the error message:

    ###
    GET http://127.0.0.1:8000/predict/?bill_length_mm=40.3
    Accept: application/json


6. Set up a machine learning model with lifespan events

Last, let’s look at how we can set up our model with FastAPI lifespan events. The advantage of doing that is we can make sure no request will be accepted while the model is still being set up and the memory used will be cleaned up afterward. To do that, we will use an `asynccontextmanager`. Before `app = FastAPI()` we will add:

    from contextlib import asynccontextmanager

    ml_models = {}

    @asynccontextmanager
    async def lifespan(app: FastAPI):
        # Set up the ML model here

        yield
        # Clean up the models and release resources
        ml_models.clear()

Now we will move the import of pandas and scikit-learn to be alongside the other imports. We will also move our setup code inside the `lifespan` function, setting the machine learning model and LabelEncoder inside `ml_models` like this:

    from fastapi import FastAPI
    from contextlib import asynccontextmanager
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn import preprocessing
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    ml_models = {}

    @asynccontextmanager
    async def lifespan(app: FastAPI):
        # Set up the ML model here
        data = pd.read_csv('penguins.csv')
        data = data.dropna()

        le = preprocessing.LabelEncoder()
        X = data[["bill_length_mm", "flipper_length_mm"]]
        le.fit(data["species"])
        y = le.transform(data["species"])
        X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
        clf = Pipeline(
            steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))]
        )
        clf.set_params().fit(X_train, y_train)

        ml_models["clf"] = clf
        ml_models["le"] = le

        yield
        # Clean up the models and release resources
        ml_models.clear()

After that we will add the `lifespan=lifespan` parameter in `app = FastAPI()`:

app = FastAPI(lifespan=lifespan)

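Since `clf` and `le` are now local to `lifespan`, update `predict` so that it reads the classifier and the encoder from `ml_models` (`ml_models["clf"]` and `ml_models["le"]`). Now save and test again. Everything should work and we should see the same result as before.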

Afterthought: When to train the model?

From our example, you may wonder when the model is trained. Since `clf` is trained at the beginning, i.e. when the service is launched, you may wonder why we do not train the model every time someone makes a prediction.

We do not want the model to be trained every time someone makes a call, because it costs way more resources to re-train everything. Additionally, it may cause race conditions since our FastAPI application is working concurrently. This is especially the case if we use live data that changes all the time.

Technically, we can set up an API to collect data and re-train the model (which we will demonstrate in the next example). Other options would be to schedule a re-train at a certain time when a certain amount of new data has been collected or to let a super user upload new data and trigger the re-training.
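One way to avoid overlapping re-trains, assuming they are triggered from async code, is to serialize them with a lock. The following is a minimal sketch with `asyncio.Lock`; the `retrain` coroutine is a hypothetical name and the short sleep merely stands in for real training.

```python
import asyncio


async def retrain(job_id: int, lock: asyncio.Lock, log: list) -> None:
    """Run one re-train at a time, even when several requests arrive together."""
    async with lock:
        log.append(("start", job_id))
        await asyncio.sleep(0.01)  # stand-in for the actual training work
        log.append(("end", job_id))


async def main() -> list:
    lock = asyncio.Lock()
    log = []
    # Three re-train requests arrive at almost the same moment.
    await asyncio.gather(*(retrain(job_id, lock, log) for job_id in (1, 2, 3)))
    return log


training_log = asyncio.run(main())
print(training_log)
```

Every "start" entry is followed directly by the matching "end" entry, which shows that no two re-trains overlap.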

So far, we are aiming to build a prototype that runs locally. Check out this article on deploying a FastAPI project on a cloud service for more information.

What is concurrency?

To put it simply, concurrency is like when you are cooking in the kitchen, and while waiting for the water to boil, you go ahead and chop the vegetables. In the web service world, the server talks to many terminals, and the communication between the server and the terminals is slower than most internal operations, so the server does not talk to and serve the terminals one by one. Instead, it talks to and serves many of them at the same time while fulfilling their requests. You may want to check out this explanation in the FastAPI documentation.

In Python, this is achieved by using async code. In our FastAPI code, the use of `async def` instead of `def` is a clear sign that FastAPI is working concurrently. Python async code also uses other constructs, such as the `await` keyword and functions like `asyncio.get_event_loop`, but we won’t be able to cover them in this blog post.
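To make the kitchen analogy concrete, here is a small sketch that uses only the standard library. The `handle_request` coroutine and the client names are made up for illustration; the sleep stands in for slow network communication.

```python
import asyncio
import time


async def handle_request(name: str, wait_seconds: float) -> str:
    """Pretend to serve one client; the sleep stands in for slow network I/O."""
    await asyncio.sleep(wait_seconds)  # control returns to the event loop here
    return f"{name} done"


async def main() -> list:
    # All three "requests" wait at the same time instead of one after another.
    return await asyncio.gather(
        handle_request("client-1", 0.2),
        handle_request("client-2", 0.2),
        handle_request("client-3", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Served one by one, the three waits would add up to about 0.6 seconds; served concurrently, the total stays close to a single wait.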

How to use FastAPI for an image classification project

To discover more FastAPI functionality, we will add an image classification model based on the MNIST example in Keras to our application as well (we are using the TensorFlow backend). If you installed the `requirements.txt` provided, you should have Keras and Pillow installed for image processing and building a convolutional neural network (CNN).

1. Refactoring

Before we start, let’s refactor our code. To make the code more organized, we will put the model setup for the penguins prediction in a function:

    def penguins_pipeline():
        data = pd.read_csv('penguins.csv')
        data = data.dropna()

        le = preprocessing.LabelEncoder()
        X = data[["bill_length_mm", "flipper_length_mm"]]
        le.fit(data["species"])
        y = le.transform(data["species"])
        X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
        clf = Pipeline(
            steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))]
        )
        clf.set_params().fit(X_train, y_train)

        return clf, le

Then we rewrite the lifespan function. With full-line code completion in PyCharm, it is very easy:
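A minimal sketch of what the rewritten `lifespan` function might look like, assuming it simply calls `penguins_pipeline()` and stores both return values. To keep the snippet runnable without FastAPI or scikit-learn, `penguins_pipeline` is replaced by a placeholder here, and `demo` is a hypothetical helper that drives the context manager by hand, roughly the way the server does at startup and shutdown.

```python
import asyncio
from contextlib import asynccontextmanager

ml_models = {}


def penguins_pipeline():
    """Placeholder for the real function; it returns stand-in objects."""
    return "trained-classifier", "fitted-label-encoder"


@asynccontextmanager
async def lifespan(app):
    # Set up the ML model here
    clf, le = penguins_pipeline()
    ml_models["clf"] = clf
    ml_models["le"] = le

    yield
    # Clean up the models and release resources
    ml_models.clear()


async def demo():
    """Capture the state while requests would be served and after shutdown."""
    async with lifespan(app=None):
        during = dict(ml_models)
    return during, dict(ml_models)


during, after = asyncio.run(demo())
print(during, after)
```

Everything before `yield` runs before the first request is accepted, and everything after it runs when the application shuts down.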

2. Set up a CNN model for MNIST prediction

In a similar fashion to the penguin prediction model, we create a function for MNIST prediction (and we will store the meta parameters globally):

    import numpy as np
    import keras
    from keras import layers

    # MNIST model meta parameters
    num_classes = 10
    input_shape = (28, 28, 1)
    batch_size = 128
    epochs = 15

    def mnist_pipeline():
        # Load the data and split it between train and test sets
        (x_train, y_train), _ = keras.datasets.mnist.load_data()

        # Scale images to the [0, 1] range
        x_train = x_train.astype("float32") / 255

        # Make sure images have shape (28, 28, 1)
        x_train = np.expand_dims(x_train, -1)

        # convert class vectors to binary class matrices
        y_train = keras.utils.to_categorical(y_train, num_classes)

        model = keras.Sequential(
            [
                keras.Input(shape=input_shape),
                layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
                layers.MaxPooling2D(pool_size=(2, 2)),
                layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
                layers.MaxPooling2D(pool_size=(2, 2)),
                layers.Flatten(),
                layers.Dropout(0.5),
                layers.Dense(num_classes, activation="softmax"),
            ]
        )

        model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

        model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

        return model

Then add the model setup in the lifespan function:

ml_models["cnn"] = mnist_pipeline()

Note that since this is added, every time you make changes to `main.py` and save, the model will be trained again. It can take a bit of time. So in development you may want to use a dummy model that requires no training time at all or a pre-trained model instead. After training, the CNN model will be ready to go.

3. Set up a POST endpoint for uploading an image file for prediction

To set up an endpoint that takes an upload file, we have to use UploadFile in FastAPI:

    @app.post("/predict-image/")
    async def predict_upload_file(file: UploadFile):
        img = await file.read()

        # process image for prediction
        img = Image.open(BytesIO(img)).convert('L')
        img = np.array(img).astype("float32") / 255
        img = np.expand_dims(img, (0, -1))

        # predict the result
        result = ml_models["cnn"].predict(img).argmax(axis=-1)[0]

        return {"filename": file.filename, "result": str(result)}

Please note that this is a POST endpoint (so far we have only set up GET endpoints).

Don’t forget to import `UploadFile` from `fastapi`:

from fastapi import FastAPI, UploadFile

And `Image` from Pillow. We are also using `BytesIO` from the `io` module:

    from PIL import Image
    from io import BytesIO

To test this using the PyCharm HTTP Client with a test image file, we will make use of the `multipart/form-data` encoding. You can check out the HTTP request syntax here. This is what you will put in the `test_main.http` file:

    ###
    POST http://127.0.0.1:8000/predict-image/ HTTP/1.1
    Content-Type: multipart/form-data; boundary=boundary

    --boundary
    Content-Disposition: form-data; name="file"; filename="test_img0.png"

    < ./test_img0.png
    --boundary--

4. Add an API to collect data and trigger retraining

Now, here comes the retraining. We set up a POST endpoint like above to accept a zip file which contains training images and labels. The zip file will then be processed and the training data will be prepared. After that we will fit the CNN model again:

    @app.post("/upload-images/")
    async def retrain_upload_file(file: UploadFile):
        img_files = []
        labels_file = None
        train_img = None

        with ZipFile(BytesIO(await file.read()), 'r') as zfile:
            for fname in zfile.namelist():
                if fname[-4:] == '.txt' and fname[:2] != '__':
                    labels_file = fname
                elif fname[-4:] == '.png':
                    img_files.append(fname)

            if len(img_files) == 0:
                return {"error": "No training images (png files) found."}
            else:
                for fname in sorted(img_files):
                    with zfile.open(fname) as img_file:
                        img = img_file.read()

                        # process image
                        img = Image.open(BytesIO(img)).convert('L')
                        img = np.array(img).astype("float32") / 255
                        img = np.expand_dims(img, (0, -1))

                        if train_img is None:
                            train_img = img
                        else:
                            train_img = np.vstack((train_img, img))

            if labels_file is None:
                return {"error": "No training labels file (txt file) found."}
            else:
                with zfile.open(labels_file) as labels:
                    labels_data = labels.read()
                    labels_data = labels_data.decode("utf-8").split()
                    labels_data = np.array(labels_data).astype("int")
                    labels_data = keras.utils.to_categorical(labels_data, num_classes)

        # retrain model
        ml_models["cnn"].fit(train_img, labels_data, batch_size=batch_size, epochs=epochs, validation_split=0.1)

        return {"message": "Model trained successfully."}

Remember to import `ZipFile`:

from zipfile import ZipFile
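The endpoint above relies on a particular archive layout: `.png` images plus one `.txt` file of whitespace-separated labels, matched to the images in sorted file name order. The sketch below isolates that layout handling using only the standard library. The `split_archive` helper and the sample file names are hypothetical, and the image bytes are dummies rather than real pictures.

```python
from io import BytesIO
from zipfile import ZipFile


def split_archive(zip_bytes: bytes):
    """Return (sorted image names, list of integer labels) from a training zip."""
    img_files, labels = [], []
    with ZipFile(BytesIO(zip_bytes), "r") as zfile:
        for fname in zfile.namelist():
            if fname.endswith(".txt") and not fname.startswith("__"):
                labels = [int(x) for x in zfile.read(fname).decode("utf-8").split()]
            elif fname.endswith(".png"):
                img_files.append(fname)
    return sorted(img_files), labels


# Build a tiny archive in memory to try it out (image contents are dummies).
buffer = BytesIO()
with ZipFile(buffer, "w") as zfile:
    zfile.writestr("img_1.png", b"not a real image")
    zfile.writestr("img_0.png", b"not a real image")
    zfile.writestr("labels.txt", "7 3\n")

names, labels = split_archive(buffer.getvalue())
print(names, labels)
```

A helper like this also makes it easy to compare `len(names)` with `len(labels)` and reject a mismatched archive before any training starts.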

If we now try the endpoint with this zip file of 1000 retraining images and labels, you will see that it takes a moment for the response to come, as the training is taking a while:

    POST http://127.0.0.1:8000/upload-images/ HTTP/1.1
    Content-Type: multipart/form-data; boundary=boundary

    --boundary
    Content-Disposition: form-data; name="file"; filename="training_data.zip"

    < ./retrain_img.zip
    --boundary--

Imagine the zip files contain more training data or you’re retraining a more complicated model. The user would then have to wait for a long time and it would seem like things are not working for them.

5. Retrain the model with BackgroundTasks

A better way to handle retraining is to process the training data after receiving it, check that it is in the right format, respond right away to say that the retraining has started, and then train the model in `BackgroundTasks`. Here is how to do it. First, we will add `BackgroundTasks` to our `upload-images` endpoint:

    @app.post("/upload-images/")
    async def retrain_upload_file(file: UploadFile, background_tasks: BackgroundTasks):
        ...

Remember to import it from `fastapi`:

from fastapi import FastAPI, UploadFile, BackgroundTasks

Then, we will put the fitting of the model into the `background_tasks`:

    # retrain model
    background_tasks.add_task(
        ml_models["cnn"].fit,
        train_img,
        labels_data,
        batch_size=batch_size,
        epochs=epochs,
        validation_split=0.1
    )

Also, we will update the message in the response:

return {"message": "Data received successfully, model training has started."}

Now test the endpoint again. You will see that the response has arrived much quicker, and if you look at the Run window, you’ll see that the training is running after the response has arrived.

At this point, more functionality can be added, for example, an option to notify the user later (e.g. via email) when the training is finished or track the training progress in a dashboard when a full application is built.
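As a first step toward such progress tracking, the fitting call could be wrapped in a small function that records its state. This is only a sketch: `training_status`, `run_training`, and `fake_fit` are hypothetical names, and `fake_fit` stands in for `ml_models["cnn"].fit`.

```python
training_status = {"state": "idle", "detail": ""}


def run_training(fit, *args, **kwargs):
    """Call `fit` and record progress so another endpoint could report it."""
    training_status.update(state="running", detail="")
    try:
        fit(*args, **kwargs)
    except Exception as exc:  # keep the failure visible instead of losing it
        training_status.update(state="failed", detail=str(exc))
    else:
        training_status.update(state="finished", detail="")


def fake_fit(data, epochs=1):
    """Hypothetical stand-in for the real training call."""
    if not data:
        raise ValueError("no training data")


run_training(fake_fit, [1, 2, 3], epochs=2)
print(training_status)
```

In the endpoint, the wrapper could be scheduled with `background_tasks.add_task(run_training, ml_models["cnn"].fit, train_img, labels_data, ...)`, and a separate GET endpoint could return `training_status`.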

Develop ML FastAPI applications with PyCharm

FastAPI provides an easy way to convert your data science project into a working application in several easy steps. It is perfect for data science teams that want to provide an application prototype for their machine learning model which can be further developed into a professional web application if needed. 

PyCharm Professional is the Python IDE that allows you to develop FastAPI applications more easily with a preconfigured project for FastAPI, coding assistance, tailored run/debug configurations, and the Endpoints tool window for managing API endpoints efficiently.


In this blog post, we showed the process of providing a simple API for a pre-trained prediction model. To learn more about FastAPI, I would suggest checking out the official FastAPI documentation. If you’re choosing between different frameworks, explore how FastAPI differs from Django.

About the author Cheuk Ting Ho

Cheuk has been a Data Scientist at various companies – a job that demands high numerical and programming skills, especially in Python. Following her passion for the tech community, Cheuk has been a Developer Advocate for three years. She also contributes to multiple open-source libraries like Hypothesis, Pytest, pandas, Polars, PyO3, Jupyter Notebook, and Django. Cheuk is currently a consultant and trainer at CMD Limes.


Stack Abuse: Securing Your Email Sending With Python: Authentication and Encryption

Wed, 2024-09-18 22:29

Email encryption and authentication are modern security techniques that you can use to protect your emails and their content from unauthorized access.

Everyone, from individuals to business owners, uses emails for official communication, which may contain sensitive information. Therefore, securing emails is important, especially when cyberattacks like phishing, smishing, etc. are soaring high.

In this article, I'll discuss how to send emails in Python securely using email encryption and authentication.

Setting Up Your Python Environment

Before you start creating the code for sending emails, set up your Python environment first with the configurations and libraries you'll need.

You can send emails in Python using:

  • Simple Mail Transfer Protocol (SMTP): This application-level protocol simplifies the process since Python offers a built-in module (smtplib) for sending emails. It's suitable for businesses of all sizes as well as individuals to automate secure email sending in Python. We're using the Gmail SMTP service in this article.

  • An email API: You can leverage a third-party API like Mailtrap Python SDK, SendGrid, Gmail API, etc., to dispatch emails in Python. This method offers more features and high email delivery speeds, although it requires some investment.

In this tutorial, we're opting for the first choice - sending emails in Python using SMTP, facilitated by the smtplib library. This library uses the RFC 821 protocol and interacts with SMTP and mail servers to streamline email dispatch from your applications. Additionally, you should install packages to enable Python email encryption, authentication, and formatting.

Step 1: Install Python

Install the Python programming language on your computer (Windows, macOS, Linux, etc.). You can visit the official Python website and download and install it from there.

If you've already installed it, run this command to verify it:

python --version

Step 2: Install Necessary Modules and Libraries
All of the modules below ship with Python's standard library, so there is nothing extra to install; you only need to import them.

  • smtplib: This handles SMTP communications. Use the code below to import 'smtplib' so you can connect with your email server:

    import smtplib
  • email module: This provides classes to construct and parse emails, including headers such as Subject, To, and From. It also facilitates email encoding and decoding with Multipurpose Internet Mail Extensions (MIME).

  • MIMEText: This class, from the email.mime.text submodule, holds the text part of your email (plain text or HTML). Import it using the code below:

    from email.mime.text import MIMEText
  • MIMEMultipart: This class, from the email.mime.multipart submodule, lets you add attachments and text sections separately in your email.

    from email.mime.multipart import MIMEMultipart
  • ssl: It provides SSL/TLS encryption for the connection to the mail server.

Step 3: Create a Gmail Account

To send emails using the Gmail SMTP email service, I recommend creating a test account to develop the code. Delete the account once you've tested the code.

The reason is, you'll need to modify the security settings of your Gmail account to enable access from the Python code for sending emails. This might expose the login details, compromising security. In addition, it will flood your account with too many test emails.

So, instead of using your own Gmail account, create a new one for creating and testing the code. Here's how to do this:

  • Create a fresh Gmail account
  • Set up your app password:
    Google Account > Security > Turn on 2-Step Verification > Security > Set up an App Password
    Next, define a name for the app password and click on "Generate". You'll get a 16-character password after following some instructions on the screen. Store the password safely.

Use this password while sending emails in Python. Here, we're using Gmail SMTP, but if you want to use another mail service provider, follow the same process. Alternatively, contact your company's IT team to seek support in accessing your SMTP server.

Email Authentication With Python

Email authentication is a security mechanism that verifies the sender's identity, ensuring the emails from a domain are legitimate. If you have no email authentication mechanism in place, your emails might land in spam folders, or malicious actors can spoof or intercept them. This could affect your email delivery rates and the sender's reputation.

This is the reason you must enable Python email authentication mechanisms and protocols, such as:

  • SMTP authentication: If you're sending emails using an SMTP server like Gmail SMTP, you can use this method of authentication. It verifies the sender's authenticity when sending emails via a specific mail server.

  • SPF: Stands for Sender Policy Framework and checks whether the IP address of the sending server is among those authorized to send email for the domain.

  • DKIM: Stands for DomainKeys Identified Mail and is used to add a digital signature to emails to ensure no one can alter the email's content while it's in transmission. The receiver's server will then verify the digital signature. Thus, all your emails and their content stay secure and unaltered.

  • DMARC: Stands for Domain-based Message Authentication, Reporting, and Conformance. DMARC instructs mail servers what to do if an email fails authentication. In addition, it provides reports upon detecting any suspicious activities on your domain.

How to Implement Email Authentication in Python

To authenticate your email in Python using SMTP, the smtplib library is useful. Here's how Python SMTP security works:

    import smtplib

    server = smtplib.SMTP('smtp.domain1.com', 587)
    server.starttls()  # Start TLS for secure connection
    server.login('my_email@domain1.com', 'my_password')
    message = "Subject: Test Email."
    server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message)
    server.quit()

Implementing email authentication will add an additional layer of security to your emails and protect them from attackers or from being marked as spam.
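Hard-coding a password in the script, as in the snippet above, makes it easy to leak. One common alternative is to read the credentials from environment variables. The sketch below assumes two variable names, SMTP_USER and SMTP_PASSWORD, which you would choose and set yourself; open_authenticated_session is a hypothetical helper, not part of smtplib.

```python
import os
import smtplib
import ssl


def open_authenticated_session(host: str, port: int = 587) -> smtplib.SMTP:
    """Connect, upgrade to TLS, and log in with credentials from the environment."""
    user = os.environ["SMTP_USER"]          # assumed variable name
    password = os.environ["SMTP_PASSWORD"]  # assumed variable name
    server = smtplib.SMTP(host, port, timeout=30)
    server.starttls(context=ssl.create_default_context())
    server.login(user, password)
    return server
```

If either variable is missing, the function raises a KeyError before any network connection is attempted, so a misconfigured environment fails early.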

Encrypting Emails With Python

Encrypting emails enables you to protect your email's content so that only authorized senders and receivers can access or view the content. Encrypting emails with Python is done using encryption techniques to encode the email message and transform it into a secure and unreadable format (also known as ciphertext).

This way, email encryption secures the message from unauthorized access or attackers even if they intercept the email.

Here are different types of email encryption:

  • SSL: This stands for Secure Sockets Layer, the original protocol for encrypting data transmitted between the mail server and the client. SSL itself is now deprecated, but the name lives on: with a default context, Python's ssl module and smtplib.SMTP_SSL negotiate TLS.

  • TLS: This stands for Transport Layer Security and is the standard email encryption protocol today. It is the successor to SSL. It encrypts the connection between an email client and the mail server to prevent anyone from intercepting the email during its transmission.

  • E2EE: This stands for end-to-end encryption, ensuring only the intended recipient with valid credentials can decrypt the email content and read it. It aims to prevent email interception and secure the message.

How to Implement Email Encryption in Python

If your mail server requires SSL encryption, here's how to send an email in Python:

    import smtplib
    import ssl

    context = ssl.create_default_context()
    server = smtplib.SMTP_SSL('smtp.domain1.com', 465, context=context)  # This is for SSL connections, requiring port number 465
    server.login('my_email@domain1.com', 'my_password')
    message = "Subject: SSL Encrypted Email."
    server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message)
    server.quit()

For TLS connections established with STARTTLS, connect with smtplib.SMTP and then upgrade the connection:

    import smtplib

    server = smtplib.SMTP('smtp.domain1.com', 587)  # TLS requires 587 port number
    server.starttls()  # Start TLS encryption
    server.login('my_email@domain1.com', 'my_password')
    message = "Subject: TLS Encrypted Email."
    server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message)
    server.quit()
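Both connection styles accept an SSL context that controls how strict the encryption is. As a small sketch, the default context can be inspected and tightened as follows; the TLS 1.2 floor is an example policy I'm assuming here, not a requirement of any particular mail provider.

```python
import ssl

# The default context verifies the server certificate and the host name.
context = ssl.create_default_context()

# Optionally refuse anything older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

You can then pass it along with server.starttls(context=context) or smtplib.SMTP_SSL(host, 465, context=context).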

For end-to-end encryption, you'll need more advanced libraries or tools such as GnuPG, OpenSSL, Signal Protocol, and more.

Combining Authentication and Encryption

Email Security with Python requires both encryption and authentication. This ensures that mail servers find the email legitimate and it stays safe from cyber attackers and unauthorized access during transmission. For email encryption, you can use either SSL or TLS and combine it with SMTP authentication to establish a robust email connection.

Now that you know how to enable email encryption and authentication in your emails, let's examine some complete code examples to understand how you can send secure emails in Python using Gmail SMTP and email encryption (SSL).

Code Examples

1. Sending a Plain Text Email

    import smtplib
    from email.mime.text import MIMEText

    subject = "Plain Text Email"
    body = "This is a plain text email using Gmail SMTP and SSL."
    sender = "sender1@gmail.com"
    receivers = ["receiver1@gmail.com", "receiver2@gmail.com"]
    password = "my_password"

    def send_email(subject, body, sender, receivers, password):
        msg = MIMEText(body)
        msg['Subject'] = subject
        msg['From'] = sender
        msg['To'] = ', '.join(receivers)
        with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp_server:
            smtp_server.login(sender, password)
            smtp_server.sendmail(sender, receivers, msg.as_string())
        print("The plain text email is sent successfully!")

    send_email(subject, body, sender, receivers, password)

Explanation:

  • sender: This contains the sender's address.
  • receivers: This contains email addresses of receiver 1 and receiver 2.
  • msg: This is the content of the email.
  • sendmail(): This is the SMTP object's instance method. It takes three parameters - sender, receiver, and msg and sends the message.
  • with: This is a context manager that is used to properly close an SMTP connection once an email is sent.
  • MIMEText: This holds only plain text.
2. Sending an Email with Attachments

To send an email in Python with attachments securely, you will need a few more pieces of the standard email package, such as the MIMEBase class and the encoders module. Here's the code for this case:

    import smtplib
    from email import encoders
    from email.mime.base import MIMEBase
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText

    sender = "sender1@gmail.com"
    password = "my_password"
    receiver = "receiver1@gmail.com"
    subject = "Email with Attachments"
    body = "This is an email with attachments created in Python using Gmail SMTP and SSL."

    # Read the file in binary mode and wrap it in a generic MIME part
    with open("attachment.txt", "rb") as attachment:
        part = MIMEBase("application", "octet-stream")
        part.set_payload(attachment.read())

    encoders.encode_base64(part)  # Base64-encode the payload for safe transport
    # The header marks the part as an attachment and gives its file name
    part.add_header("Content-Disposition", "attachment", filename="attachment.txt")

    message = MIMEMultipart()
    message['Subject'] = subject
    message['From'] = sender
    message['To'] = receiver
    text_part = MIMEText(body)
    message.attach(text_part)
    message.attach(part)  # To attach the file

    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
        server.login(sender, password)
        server.sendmail(sender, receiver, message.as_string())

Explanation:

  • MIMEMultipart: This class lets you add the text and the attachments to an email as separate parts.
  • 'rb': This opens the attachment in binary read mode so its raw bytes can be read.
  • MIMEBase: This object is applicable to any file type.
  • Encode and Base64: The file will be encoded in Base64 for safe email sending.
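
As an alternative sketch, the standard library's newer EmailMessage class can do the MIME wrapping and Base64 encoding in a single add_attachment() call. The helper name build_message_with_attachment and its parameters are assumptions for illustration; the article itself uses the MIMEBase approach shown above.

```python
import mimetypes
from email.message import EmailMessage
from pathlib import Path


def build_message_with_attachment(sender, receiver, subject, body, path):
    """Build a message with one file attached (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = receiver
    msg["Subject"] = subject
    msg.set_content(body)

    path = Path(path)
    # Guess the MIME type from the file name; fall back to a generic binary type
    ctype, encoding = mimetypes.guess_type(path.name)
    if ctype is None or encoding is not None:
        ctype = "application/octet-stream"
    maintype, subtype = ctype.split("/", 1)

    # Bytes content is Base64-encoded by default
    msg.add_attachment(path.read_bytes(), maintype=maintype,
                       subtype=subtype, filename=path.name)
    return msg
```

The resulting object can be passed to an SMTP connection's send_message() method instead of calling sendmail() with message.as_string().
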
Sending an HTML Email in Python

To send an HTML email in Python using Gmail SMTP, you need the MIMEText class with its subtype set to 'html'.

Here's the full code for sending an HTML email in Python:

    import smtplib
    from email.mime.text import MIMEText

    sender = "sender1@gmail.com"
    password = "my_password"
    receiver = "receiver1@gmail.com"
    subject = "HTML Email in Python"
    body = """
    <html>
      <body>
        <p>HTML email created in Python with SSL and Gmail SMTP.</p>
      </body>
    </html>
    """

    message = MIMEText(body, 'html')  # To attach the HTML content to the email
    message['Subject'] = subject
    message['From'] = sender
    message['To'] = receiver

    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
        server.login(sender, password)
        server.sendmail(sender, receiver, message.as_string())

Testing Your Email With Authentication and Encryption

Testing your emails before sending them to the recipients is important. It enables you to discover any issues or bugs in sending emails or with the formatting, content, etc.

Thus, always test your emails on a staging server before delivering them to your target recipients, especially when sending emails in bulk. Testing emails provides the following advantages:

  • Ensures the email sending functionality is working fine
  • Emails have proper formatting and no broken links or attachments
  • Prevents flooding the recipient's inbox with a large number of test emails
  • Enhances email deliverability and reduces spam rates
  • Ensures the email and its contents stay protected from attacks and unauthorized access

To test this combined setup of sending emails in Python with authentication and encryption enabled, use an email testing server like Mailtrap Email Testing. This will capture all the SMTP traffic from the staging environment, and detect and debug your emails before sending them. It will also analyze the email content, validate CSS/HTML, and provide a spam score so you can improve your email sending.

To get started:

  • Open Mailtrap Email Testing
  • Go to 'My Inbox'
  • Click on 'Show Credentials' to get your test credentials - login and password details

Here's the Full Code Example for Testing Your Emails:

    import smtplib
    from socket import gaierror

    port = 2525
    # Define the SMTP server separately
    smtp_server = "sandbox.smtp.mailtrap.io"
    login = "xyz123"  # Paste your Mailtrap login details
    password = "abc$$"  # Paste your Mailtrap password

    sender = "test_sender@test.com"
    receiver = "test_receiver@example.com"

    message = f"""\
    Subject: Hello There!
    To: {receiver}
    From: {sender}

    This is a test email."""

    try:
        # Use Mailtrap-generated credentials for port, server name, login, and password
        with smtplib.SMTP(smtp_server, port) as server:
            server.login(login, password)
            server.sendmail(sender, receiver, message)
        print('Sent')
    except (gaierror, ConnectionRefusedError):
        # In case of errors
        print('Unable to connect to the server.')
    except smtplib.SMTPServerDisconnected:
        print('Server connection failed!')
    except smtplib.SMTPException as e:
        print('SMTP error: ' + str(e))

If there's no error, you should see this message in the receiver's inbox:

    This is a test email.

Best Practices for Secure Email Sending

Consider the following best practices for sending email securely in Python:

  • Protect data: Take appropriate security measures to protect your sensitive data such as SMTP credentials, API keys, etc. Store them in a secure, private place like config files or environment variables, ensuring no one can access them publicly.

  • Encryption and authentication: Always use email encryption and authentication so that only authorized individuals can access your emails and their content.

    For authentication, you can use advanced methods like API keys, two-factor authentication, single sign-on (SSO), etc. Similarly, use advanced encryption techniques like SSL, TLS, E2EE, etc.

  • Error handling: Manage network issues, authentication errors, and other issues by handling errors effectively using try/except blocks in your code.

  • Rate-Limiting: Maintain high email deliverability by rate-limiting the email sending functionality to prevent exceeding your service limits.

  • Validate Emails: Validate email addresses from your list and remove invalid ones to enhance email deliverability and prevent your domain from getting marked as spam. You can use an email validation tool to do this.

  • Educate: Keep your team updated with secure email practices and cybersecurity risks. Monitor your spam score and email deliverability rates, and work to improve them.
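
Several of these practices can be sketched together in a few lines. Everything named here is an assumption for illustration: the SMTP_LOGIN and SMTP_PASSWORD variable names, the looks_valid() syntax check (which does not prove a mailbox exists), and the send_one callable, which stands in for whatever function actually sends a single message.

```python
import os
import smtplib
import time
from email.utils import parseaddr


def load_credentials(env=os.environ):
    """Protect data: read credentials from environment variables, not source code."""
    # SMTP_LOGIN and SMTP_PASSWORD are hypothetical variable names
    return env["SMTP_LOGIN"], env["SMTP_PASSWORD"]


def looks_valid(address):
    """Validate emails: a cheap syntax check only."""
    _, addr = parseaddr(address)
    local, sep, domain = addr.rpartition("@")
    return bool(sep and local and "." in domain and " " not in addr)


def send_batch(send_one, recipients, delay_seconds=1.0, sleep=time.sleep):
    """Send to each valid recipient, pausing between attempts."""
    sent, failed = [], []
    for address in recipients:
        if not looks_valid(address):
            failed.append((address, "invalid address"))
            continue
        try:
            send_one(address)
            sent.append(address)
        except (smtplib.SMTPException, OSError) as exc:
            # Error handling: record the failure and keep going
            failed.append((address, str(exc)))
        sleep(delay_seconds)  # Rate limiting: stay under the provider's limits
    return sent, failed
```

A fixed delay is the simplest form of rate limiting. A real sender would probably size the delay from the provider's documented quota and retry transient failures.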

Wrapping Up

You can secure email sending with Python by using email encryption methods like SSL, TLS, and end-to-end encryption, as well as authentication protocols and techniques such as SPF, DMARC, 2FA, and API keys.

By combining these security measures, you can protect your confidential email information, improve email deliverability, and maintain trust with your target recipients. Only individuals with the appropriate credentials can then access your emails, which helps prevent unauthorized access, data breaches, and other cybersecurity attacks.

Categories: FLOSS Project Planets

The Python Show: 47 - Python Projects of 2024

Wed, 2024-09-18 20:37

I’ve been working on lots of projects this year. Here are the ones I highlighted in this episode:


Armin Ronacher: Accidental Spending: A Case For an Open Source Tax?

Wed, 2024-09-18 20:00

Both last week at London tech leaders and this week at the Open Source Summit in Vienna I engaged in various discussions about pledging money to Open Source. At Sentry we have been funding our Open Source dependencies for a few years now and we're trying to encourage others to do the same.

It’s not an easy ask, of course. One quite memorable point raised was what I would call “accidental spending”. The story goes like this: an engineering team spins up a bunch of Kubernetes machines. As the fleet grows in scale some inefficiencies creep in. To troubleshoot or optimize, additional services such as load balancers, firewalls, cloud provider log services, etc. are provisioned with minimal discussion. Initially none of that was part of the plan, but ever so slightly for every computing resource, some extra stuff is paid on top creating largely hidden costs. Ideally all of that pays off (after all, hopefully by debugging quicker you reduce that downtime, by having that load balancer you can auto scale and save on unused computing resources etc.). But often, the payoff feels abstract and is hard to quantify.

I call those purchases “accidental” because they are proportional to the deployed infrastructure and largely act like a tax on top of everything. Only after a while does the scale of that line item become apparent. On the other hand, purchasing a third party system is a very intentional act. It's deliberate: it requires conversations, and more scrutiny is placed on putting a credit card into a new service. Companies providing services understand this and are positioning themselves accordingly. Their play could be to make the case that their third party solution is better, cheaper etc.

Open Source funding could be seen through both of these lenses. Today, in many ways, pledging money to Open Source is a very intentional decision. It requires discussions, persuasion and justification. The purpose and the pay-off are not entirely clear. Companies are not used to the idea of funding Open Source and they don't have a strong model to reason about these investments. Likewise many Open Source projects themselves also don't have a good way of dealing with money and might lack the governance to handle funds effectively. After all many of these projects are run by individuals and not formal organizations.

Companies are unlikely to fund something without understanding the return on investment. One better understood idea is to turn that one “random person in Nebraska” maintaining a critical dependency into a well-organized team with good op-sec. But for that to happen, funding needs to scale from pennies to dollars, making it really worthwhile.

My colleague Chad Whitacre floated an idea: what if platforms like AWS or GitHub started splitting the check by adding a line-item to the invoices of their customers to support Open Source funding? It would turn giving to Open Source into more of a tax-like thing. That might leverage the general willingness to just pile up on things to do good things. If we all pay 3% on top of our Cloud or SaaS bills to give to Open Source this would quickly add up.

While I’m intrigued by the idea, I also have my doubts that this would work. It goes back to the problem mentioned earlier that some Open Source projects just have no governance or are not even ready to receive money. How much value you put on a dependency is also very individual. Just because an NPM package has a lot of downloads does not necessarily mean it's critical to the mission of the company. rrweb is a good example for us at Sentry. It sits at the core of our session replay product but since we vendor a pinned fork, you would not see rrweb in your dependency tree. We also value that package more than some algorithm would be able to determine.

So the challenge with the tax — as appealing as it is — is that it might make the “purchase decision” of funding Open Source easier, but it would probably make the distribution problem much worse. Deliberate, intentional funding is key. At least for the moment.

Still, it’s worth considering. The “what if” is a powerful idea. Using a restaurant analogy, the “open-source tax” is like the mandatory VAT or health surcharge on your bill: no choice is involved. Another model could be more like the tip suggestions on a receipt offering a choice but also guidance on what’s appropriate to contribute.

The current model we propose with our upcoming Open Source Pledge is to suggest, like a tip, what you should give in relation to your developer work force. Take the average number of full time engineers you have over a year and multiply this by 2000. That is the amount in US dollars you should give to your Open Source dependencies.
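
As a quick worked sketch of that arithmetic: the monthly sampling and the headcount numbers below are made up for illustration, and only the 2000 US dollar multiplier comes from the pledge as described here.

```python
def suggested_pledge(monthly_engineer_counts, dollars_per_engineer=2000):
    """Average the headcount over the year, then multiply by the per-engineer amount."""
    average = sum(monthly_engineer_counts) / len(monthly_engineer_counts)
    return average * dollars_per_engineer


# Hypothetical company: 10 full time engineers for six months, then 14
headcounts = [10] * 6 + [14] * 6
pledge = suggested_pledge(headcounts)  # average of 12 engineers -> 24000.0
```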

That sounds like a significant amount! But let's put this in relation to a typical developer you employ: that's less than a fifth of what you would pay for FICA (Federal Insurance Contributions Act) in the US. That's less than the communal tax you would pay in Austria. I'm sure you can think of similar payroll taxes in your country.

I believe that after step one of recognizing there is a funding problem follows an obvious step two: having a baseline funding amount that stands in relation to the business you own or are a part of. Using the size of the development team as a metric offers an objective and quantifiable starting point. The beauty in my mind of the developer count in particular is that it's somewhat independently observable from both the outside and inside [1]. The latter is important! It creates a baseline for people within a company to start a conversation about Open Source funding.

If you have feedback on this, in particular on the pledge, I invite you to mail me or to leave a comment on the Pledge's issue tracker.

[1] There is an analogy to historical taxation here. For instance the Window Tax was taxation based on the number of windows in a building. That made enforcement easy because you could count them from street level. The downside of that was obviously the unintended consequences that this caused. Something to always keep in mind!

Python Engineering at Microsoft: Announcing the new Python Data Science Extension Pack for VS Code

Wed, 2024-09-18 19:53

We’re thrilled to announce the launch of the new Python Data Science Extension Pack for Visual Studio Code! This powerful pack brings together some of the most popular and essential VS Code extensions, making it your one-stop shop for all things data science in Python.

What’s Inside?

Our extension pack is designed to streamline your data science journey from start to finish. Whether you’re preparing data, conducting analysis, visualizing results, or building and training machine learning models, we’ve got you covered.

This Data Science extension pack currently includes four extensions:

  • Python – Provides rich support for the Python language such as IntelliSense, debugging, formatting, linting, code navigation, refactoring, variable explorer, test explorer, and more.
  • Jupyter – Used to create and edit Jupyter Notebooks, add and run code/markdown cells, render plots, create presentation-friendly versions of your notebook by exporting to HTML or PDF and more.
  • GitHub Copilot – An AI pair programmer tool that helps you write code faster and smarter.
  • Data Wrangler – A code-centric data viewing and cleaning tool to explore, visualize, and clean tabular data.
Get started today

Dive into the world of data science by installing the Python Data Science Extension Pack for VS Code from the VS Code extension marketplace.

We encourage you to provide feedback and file issues. Additionally, if there are other VS Code extensions that you feel are essential to the data science workflow, please let us know by creating a ticket in our GitHub repo.

The post Announcing the new Python Data Science Extension Pack for VS Code appeared first on Python.


Real Python: Python 3.13 Preview: Free Threading and a JIT Compiler

Wed, 2024-09-18 10:00

Although the final release of Python 3.13 is scheduled for October 2024, you can download and install a preview version today to explore the new features. Notably, the introduction of free threading and a just-in-time (JIT) compiler are among the most exciting enhancements, both designed to give your code a significant performance boost.

In this tutorial, you’ll:

  • Compile a custom Python build from source using Docker
  • Disable the Global Interpreter Lock (GIL) in Python
  • Enable the Just-In-Time (JIT) compiler for Python code
  • Determine the availability of new features at runtime
  • Assess the performance improvements in Python 3.13
  • Make a C extension module targeting Python’s new ABI

Check out what’s new in the Python changelog for a complete list of the upcoming features and improvements. This document contains a quick summary of the release highlights as well as a detailed breakdown of the planned changes.


Free Threading and JIT in Python 3.13: What’s the Fuss?

Before going any further, it’s important to note that the majority of improvements in Python 3.13 will remain invisible to the average Joe. This includes free threading (PEP 703) and the JIT compiler (PEP 744), which have already sparked a lot of excitement in the Python community.

Keep in mind that they’re both experimental features aimed at power users, who must take extra steps to enable them at Python’s build time. None of the official channels will distribute Python 3.13 with these additional features enabled by default. This is to maintain backward compatibility and to prevent potential glitches, which should be expected.

Note: Don’t try to use Python 3.13 with the experimental features in a production environment! It may cause unexpected problems, and the Python Steering Council reserves the right to remove these features entirely from future Python releases if they prove to be unstable. Treat them as an experiment to gather real-world data.

In this section, you’ll get a birds-eye view of these experimental features so you can set the right expectations. You’ll find detailed explanations on how to enable them and evaluate their impact on Python’s performance in the remainder of this tutorial.

Free Threading Makes the GIL Optional

Free threading is an attempt to remove the Global Interpreter Lock (GIL) from CPython, which has traditionally been the biggest obstacle to achieving thread-based parallelism when performing CPU-bound tasks. In short, the GIL allows only one thread of execution to run at any given time, regardless of how many cores your CPU is equipped with. This prevents Python from leveraging the available computing power effectively.
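
To picture the kind of workload this is about, here's a small sketch that spreads a CPU-bound function across threads. The count_primes() helper is a made-up example. On a regular build, the threads are expected to take turns holding the GIL, while a free-threaded build can, in principle, run them on several cores at once. The results are the same either way, and only the wall-clock time would differ.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """Count the primes below limit with naive trial division (pure CPU work)."""
    return sum(
        1
        for n in range(2, limit)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    )


def run_in_threads(limits, workers=4):
    """Run count_primes() for each limit on a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))
```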

There have been many attempts in the past to bypass the GIL in Python, each with varying levels of success. You can read about these attempts in the tutorial on bypassing the GIL. While previous attempts were made by third parties, this is the first time that the core Python development team has taken similar steps with the permission of the steering council, even if some reservations remain.

Note: Python 3.12 approached the GIL obstacle from a different angle by allowing the individual subinterpreters to have their independent GILs. This can improve Python’s concurrency by letting you run different tasks in parallel, but without the ability to share data cheaply between them due to isolated memory spaces. In Python 3.13, you’ll be able to combine subinterpreters with free threading.

The removal of the GIL would have significant implications for the Python interpreter itself and especially for the large body of third-party code that relies on it. Because free threading essentially breaks backward compatibility, the long-term plan for its implementation is as follows:

  1. Experimental: Free threading is introduced as an experimental feature and isn’t a part of the official Python distribution. You must make a custom Python build to disable the GIL.
  2. Enabled: The GIL becomes optional in the official Python distribution but remains enabled by default to allow for a transition period.
  3. Disabled: The GIL is disabled by default, but you can still enable it if needed for compatibility reasons.

There are no plans to completely remove the GIL from the official Python distribution at the moment, as that would cause significant disruption to legacy codebases and libraries. Note that the steps outlined above are just a proposal subject to change. Also, free threading may not pan out at all if it makes single-threaded Python run slower than without it.

Until the GIL becomes optional in the official Python distribution, which may take a few more years, the Python development team will maintain two incompatible interpreter versions. The vanilla Python build won’t support free threading, while the special free-threaded flavor will have a slightly different Application Binary Interface (ABI) tagged with the letter “t” for threading.

This means that C extension modules built for stock Python won’t be compatible with the free-threaded version and the other way around. Maintainers of those external modules will be expected to distribute two packages with each release. If you’re one of them, and you use the Python/C API, then you’ll learn how to target CPython’s new ABI in the final section of this tutorial.
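
As a rough way to tell which of the two flavors you're running, you can inspect the interpreter at runtime. The sketch below assumes that free-threaded builds define the Py_GIL_DISABLED build variable and that Python 3.13 provides sys._is_gil_enabled(). It falls back gracefully on older versions, and the function name is just an illustration.

```python
import sys
import sysconfig


def free_threading_status():
    """Report whether this interpreter is a free-threaded build and if the GIL is active."""
    # Truthy on free-threaded builds; 0 or None on stock builds
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # Private helper added in Python 3.13; assume the GIL is on if it's missing
    is_gil_enabled = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = is_gil_enabled() if is_gil_enabled is not None else True

    return {
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
        # On POSIX, free-threaded builds are expected to include "t" here
        "abiflags": getattr(sys, "abiflags", ""),
    }
```

Note that a free-threaded build can still run with the GIL enabled, which is why the sketch reports the two facts separately.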

JIT Compiles Python to Machine Code

As an interpreted language, Python takes your high-level code and executes it on the fly without the need for prior compilation. This has both pros and cons. Some of the biggest advantages of interpreted languages include better portability across different hardware architectures and a quick development time due to the lack of a compilation step. At the same time, interpretation is much slower than directly executing code native to your machine.

Note: To be more precise, Python interprets bytecode instructions, an intermediate binary representation between pure Python and machine code. The Python interpreter compiles your code to bytecode when you import a module and stores the resulting bytecode in the __pycache__ folder. This doesn’t inherently make your Python scripts run faster, but loading a pre-processed bytecode can indeed speed up their startup time.
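
You can peek at this intermediate step with the standard library. The sketch below compiles a one-line snippet to bytecode, lists the instruction names, and asks where the cached bytecode for a module file would live. The snippet and the shop.py file name are hypothetical, and the exact instruction names vary between Python versions.

```python
import dis
import importlib.util

source = "total = price * quantity"
code = compile(source, "<example>", "exec")  # source code -> code object with bytecode

# Human-readable names of the bytecode instructions
instructions = [ins.opname for ins in dis.get_instructions(code)]

# Where the interpreter would cache the compiled bytecode for shop.py
cache_path = importlib.util.cache_from_source("shop.py")

print(instructions)
print(cache_path)
```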

Languages like C and C++ leverage Ahead-of-Time (AOT) compilation to translate your high-level code into machine code before you ship your software. The benefit of this is faster execution since the code is already in the computer’s mother tongue. While you no longer need a separate program to interpret the code, you must compile it separately for all target platforms that you want supported. You should also handle platform-specific differences yourself.

Read the full article at https://realpython.com/python313-free-threading-jit/ »


Django Weblog: Last call for DjangoCon US 2024 tickets!

Wed, 2024-09-18 08:00

DjangoCon US starts next week in Durham, NC on September 22nd!

If you aren't able to join in person, please consider purchasing an online ticket: https://ti.to/defna/djangocon-us-2024

The conference is full of a variety of talks with excellent keynote speakers! It's shaping up to be an event you'll want to experience live.

If you'd like to learn more about DjangoCon US visit them at their website or reach out to them at hello@djangocon.us.


Spyder IDE: Scientific IDE UX Birds of a Feather session at SciPy 2024

Tue, 2024-09-17 20:00
The Spyder team hosted a Birds of a Feather session at SciPy 2024, this time on the topic of users' experiences (good and bad) with the UI/UX of scientific interfaces and IDEs, and how their developers can better serve users. Here, we share what we learned from the session, as well as a link to the full detailed community notes.

PyCoder’s Weekly: Issue #647 (Sept. 17, 2024)

Tue, 2024-09-17 15:30

#647 – SEPTEMBER 17, 2024

How to Use Conditional Expressions With NumPy where()

This tutorial teaches you how to use the where() function to select elements from your NumPy arrays based on a condition. You’ll learn how to perform various operations on those elements and even replace them with elements from a separate array or arrays.
REAL PYTHON

PythonistR: A Match Made in Data Heaven

In data science you’ll sometimes hear a debate between R and Python. Cosima says ‘why not choose both?’ She outlines a data pipeline that uses the best tool for each job.
COSIMA MEYER

Transcribe Audio in 5 Lines of Code

Build AI apps that understand speech with insanely accurate speech-to-text models. Sign up for a free account and get $50 in credits to try AssemblyAI’s speech recognition models →
ASSEMBLY AI sponsor

Python HTTP Clients: Requests vs. HTTPX vs. AIOHTTP

Learn about the differences between Requests, HTTPX, and AIOHTTP, and when to use each library for your Python projects.
GEORGES HAIDAR

Quiz: Generate Images With DALL·E and the OpenAI API

In this quiz, you’ll test your understanding of generating images with DALL·E by OpenAI using Python. You’ll revisit concepts such as using the OpenAI Python library, making API calls for image generation, creating images from text prompts, and converting Base64 strings to PNG image files.
REAL PYTHON

Quiz: The Walrus Operator: Python’s Assignment Expressions

In this quiz, you’ll test your understanding of the Python Walrus Operator. This operator was introduced in Python 3.8, and understanding it can help you write more concise and efficient code.
REAL PYTHON

Python Releases 3.12.6, 3.11.10, 3.10.15, 3.9.20, and 3.8.20

PYTHON.ORG

Python Release Python 3.13.0rc2

PYTHON.ORG

Articles & Tutorials

Python Community Divided Over CoC Enforcement

Over the last few months there has been a lot of back and forth in the Python community, especially on the forums, around changes to bylaws and how the Code of Conduct is enforced. This article covers the history and context of the events.
JAKE EDGE

Django: Rotate Your Secret Key, Fast or Slow

Django’s SECRET_KEY setting is used for cryptographic signing in various places, such as for session storage and password reset tokens. If you need to rotate it you can allow read-only use of the old key to smooth the transition.
ADAM JOHNSON

Posit Connect - Help Your Analytics Team Share and Collaborate

Tired of tediously sending files and trying to use general-purpose collaboration tools? Posit Connect makes it easy to share, collaborate, and get feedback on your data science work including Jupyter notebooks, Plotly dashboards, Streamlit, Quarto, Shiny or other interactive analytics applications →
POSIT sponsor

Multiversion Python Thoughts

Armin has played around with enabling multiple versions of a library to be installed for the same instance of Python in the past, and recent feature additions to uv are making it come closer to fruition.
ARMIN RONACHER

How We Made Notebooks Load 10 Times Faster

“When we received feedback our Notebooks UI was taking too long to load, our engineers dove into ways to improve the developer experience — bringing some load times from 30 seconds down to less than one.”
LUIS NEVES

When to Use .__repr__() vs .__str__() in Python

In this video course, you’ll learn the difference between the string representations returned by .__repr__() vs .__str__() and understand how to use them effectively in classes that you define.
REAL PYTHON course

Python Bytes Episode #400

The Python Bytes podcast just delivered show #400. This is a huge accomplishment. This episode celebrates the achievement, and also covers: Python 3.13RC, Docker with uv, the humanize project, and more.
PYTHON BYTES podcast

Improved print Readability With pprint

The pretty print module (pprint) provides more readable output for complex data structures and this post shows you how to use the library and what you can get out of it.
JUHA-MATTI SANTALA

“Next Level Python” Humble Bundle for Charity

Make mastering Python your mission: This mix of online courses, books, exercises, and productivity tools is here to help you succeed—whether you’re a beginner or a skilled Python pro. Support Girls Who Code and get Python books, software, and video courses collectively valued at $1,882 for a pay-what-you-want price →
HUMBLEBUNDLE.COM sponsor

Switching From pyenv to uv

Will has recently switched from using a variety of packaging tools to just using uv. This post is a summary of what needed to change when going from pyenv to uv.
WILL KAHN-GREENE

uv Under Discussion on Mastodon

There is a deep conversation going on about the longevity of uv on Mastodon and for those not on the platform, Simon has summarized it.
SIMON WILLISON

Why Not Comments

This post talks about why you might want to include information in your code comments about why you didn’t take a particular approach.
HILLEL WAYNE

How to Build a Perfect Docker Image for a Poetry Project

This article describes how to build a secure, fast-to-build, and lightweight Docker image for your Poetry-based project.
CODEMAGEDDON • Shared by Sergey

Python macOS Framework Builds

Glyph explains just what a Framework is on macOS and why CPython on macOS should be built that way.
GLYPH LEFKOWITZ

Projects & Code

PSP (Python Scaffolding Projects)

GITHUB.COM/MATTEOGUADRINI • Shared by Matteo Guadrini

pocketpy: Portable Python 3.x Interpreter in Modern C

GITHUB.COM/POCKETPY

graphiti: Build Dynamic, Temporally-Aware Knowledge Graphs

GITHUB.COM/GETZEP

picows: Ultra-Fast Websocket Client and Server for Asyncio

GITHUB.COM/TARASKO

django-cotton: Component Based Design to Django Templates

GITHUB.COM/WRABIT

Events

PyData Amsterdam 2024

September 18 to September 21, 2024
PYDATA.ORG

Weekly Real Python Office Hours Q&A (Virtual)

September 18, 2024
REALPYTHON.COM

PyCon India 2024

September 20 to September 24, 2024
PYCON.ORG

PyCon TW 2024

September 21 to September 23, 2024
PYCON.ORG

DjangoCon US 2024

September 22 to September 27, 2024
DJANGOCON.US

PyBay 2024

September 23 to September 24, 2024
PYBAY.ORG

PyCon Africa 2024

September 24 to September 29, 2024
PYCON.ORG

PyData Paris 2024

September 25 to September 27, 2024
PYDATA.ORG

PyCon JP 2024

September 27 to September 30, 2024
PYCON.JP

Python Norte 2024

September 27 to September 29, 2024
PYTHONNORTE.ORG

PyCon Niger 2024

September 28 to September 30, 2024
PYCON.ORG

Happy Pythoning!
This was PyCoder’s Weekly Issue #647.

Python Morsels: Understanding help() in Python

Tue, 2024-09-17 11:10

When using Python's help function, have you ever wondered what the various symbols (/, *, [, and ]) mean? Understanding those symbols will help you better understand how to use the functions and classes you're working with.

Table of contents

  1. What do all the symbols mean in help output?
  2. Multiple function signatures
  3. Pay attention to the nouns
  4. The symbols of help
  5. Default values: the = symbol
  6. Unlimited arguments: the * symbol before an argument name
  7. Keyword-only arguments: a lone * symbol
  8. Positional-only arguments: a lone / symbol
  9. Arbitrary keyword arguments: the ** symbol
  10. Square brackets: optional arguments
  11. Ellipsis (...) and other weird things
  12. The conventions of Python's help function

What do all the symbols mean in help output?

We'll cover what the * and / symbols below mean:

    >>> help(sorted)
    Help on built-in function sorted in module builtins:

    sorted(iterable, /, *, key=None, reverse=False)
        Return a new list containing all items from the iterable in ascending order.

        A custom key function can be supplied to customize the sort order, and the
        reverse flag can be set to request the result in descending order.

We'll also talk about the different formats that help output comes in. For example, note the square brackets in [x] below and note that there are two different styles noted for calling int:

    >>> help(int)
    Help on class int in module builtins:

    class int(object)
     |  int([x]) -> integer
     |  int(x, base=10) -> integer

We'll start by giving a name to that line which indicates how a function, method, or class is called.
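As a rough illustration of the / and * markers in the sorted signature above, here is a minimal sketch. The function name pick_sorted is hypothetical; it only mirrors the shape of that signature so the calling rules can be tried out.

```python
import inspect


def pick_sorted(iterable, /, *, key=None, reverse=False):
    """Toy stand-in that mirrors the signature shape of sorted()."""
    return sorted(iterable, key=key, reverse=reverse)


# Everything before "/" is positional-only; everything after "*" is keyword-only.
signature_text = str(inspect.signature(pick_sorted))
print(signature_text)

# Valid call: the iterable by position, reverse by keyword.
descending = pick_sorted([3, 1, 2], reverse=True)
print(descending)

# Both of these calls break the rules above and raise TypeError.
errors = []
for bad_call in (
    lambda: pick_sorted(iterable=[3, 1, 2]),      # positional-only passed by name
    lambda: pick_sorted([3, 1, 2], None, True),   # keyword-only passed by position
):
    try:
        bad_call()
    except TypeError as exc:
        errors.append(type(exc).__name__)
print(errors)
```

The signature text printed by inspect.signature uses the same symbols that help shows.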

Multiple function signatures

A function signature notes the …

Read the full article: https://www.pythonmorsels.com/understanding-help/
Categories: FLOSS Project Planets

Real Python: Customizing VS Code Through Color Themes

Tue, 2024-09-17 10:00

A well-designed coding environment not only enhances your focus and productivity but also makes coding sessions more enjoyable. In this Code Conversation, your instructor Philipp Acsany will guide you step-by-step through the process of finding, installing, and adjusting color themes in VS Code. You’ll explore the various options available in VS Code and learn how to make fine adjustments to create a setup that suits your personal preferences.

In this video course, you’ll:

  • Learn about Themes in VS Code
  • Find a VS Code Color Theme
  • Select a Theme
  • Install Your Theme
  • Make Additional Adjustments

By the end of the course, you’ll have a coding environment that not only looks great but also enhances your overall coding experience.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Real Python: Quiz: Using Python's pip to Manage Your Projects' Dependencies

Tue, 2024-09-17 08:00

In this quiz, you’ll test your understanding of Python’s standard package manager, pip. You’ll revisit the concepts behind pip, important commands, and how to install packages.

Categories: FLOSS Project Planets

Python Bytes: #401 We must replace uWSGI with something else

Tue, 2024-09-17 04:00

Topics covered in this episode:

  • “We must replace uwsgi by something else” (https://github.com/overhangio/tutor/issues/937?featured_on=pythonbytes)
  • Let’s build and optimize a Rust extension for Python (https://pythonspeed.com/articles/intro-rust-python-extensions?utm_source=pocket_shared&featured_on=pythonbytes)
  • Fake recruiter coding tests target devs with malicious Python packages (https://www.reversinglabs.com/blog/fake-recruiter-coding-tests-target-devs-with-malicious-python-packages?featured_on=pythonbytes)
  • Monthly PSF Board Office Hours (https://pyfound.blogspot.com/2024/08/ask-questions-or-tell-us-what-you-think.html?utm_source=pocket_shared&featured_on=pythonbytes)
  • Extras
  • Joke

Watch on YouTube: https://www.youtube.com/watch?v=XKI5gtnKMus

About the show

Sponsored by ScoutAPM: pythonbytes.fm/scout (https://pythonbytes.fm/scout)

Connect with the hosts

  • Michael: @mkennedy@fosstodon.org (https://fosstodon.org/@mkennedy)
  • Brian: @brianokken@fosstodon.org (https://fosstodon.org/@brianokken)
  • Show: @pythonbytes@fosstodon.org (https://fosstodon.org/@pythonbytes)

Join us on YouTube at pythonbytes.fm/live (https://pythonbytes.fm/stream/live) to be part of the audience. Usually Monday at 10am PT. Older video versions available there too.

Finally, do you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to our friends of the show list (https://pythonbytes.fm/friends-of-the-show); we'll never share it.

Michael #1: “We must replace uwsgi by something else”

  • uWSGI is now in maintenance mode: https://uwsgi-docs.readthedocs.io/en/latest/
    • The project is in maintenance mode (only bugfixes and updates for new languages apis). Do not expect quick answers on github issues and/or pull requests (sorry for that) A big thanks to all of the users and contributors since 2009.
  • Reasonable options look like:
    • granian (https://github.com/emmett-framework/granian?featured_on=pythonbytes)
    • uvicorn (https://www.uvicorn.org?featured_on=pythonbytes)
    • hypercorn (https://hypercorn.readthedocs.io/en/latest/index.html?featured_on=pythonbytes)
    • gunicorn (https://gunicorn.org?featured_on=pythonbytes), potentially with uvicorn workers for async

Brian #2: Let’s build and optimize a Rust extension for Python

  • Itamar Turner-Trauring
  • Example: algorithm for approximating the number of unique values in a list
  • Comparison to non-approximation
    • non-approx is faster but uses way more memory
  • Rust version
    • Use Maturin and PyO3
    • Pull in Rust dependencies (rand for random numbers)
  • Optimization
    • link-time optimization
    • faster random
    • store hashes only
  • Future optimizations
    • change algorithm maybe
    • pass numpy array instead of Python list (I’d like to see that sped up)

Michael #3: Fake recruiter coding tests target devs with malicious Python packages

  • via python weekly
  • GitHub projects that have been linked to previous, targeted attacks in which developers are lured using fake job interviews.
  • Attackers posing as employees of major financial services firms.
  • This previously happened via other means such as NPM
  • This analysis revealed that the direct parent of the detected, malicious files is a Python PYC file, meaning that once again the team encountered malware hidden in a compiled Python file.
  • “The README files tell would-be candidates to make sure the project is running successfully on their system before making modifications.”
  • What can you do (according to Michael)?
    • Try out new packages in a Docker container
    • Work on code and projects using a VM which has snapshotting (to roll back completely after you’re done)
    • Fire up a Windows desktop in the cloud (https://learn.microsoft.com/en-us/azure/virtual-desktop/users/connect-windows?pivots=remote-desktop-msi&featured_on=pythonbytes) for the project then destroy it

Brian #4: Monthly PSF Board Office Hours

  • “The Office Hours will be sessions where you can share with us how we can help your community, express your perspectives, and provide feedback for the PSF.”
  • “Unless we have a dedicated topic for a session, you are not limited to talking with us about the above topics, although the discussions should be focused on Python, the PSF, and our community. If you think there’s something we can help with or we should know, we welcome you to come and talk to us!”
  • Upcoming office hours
    • October 8th, 2024: 9pm UTC
    • November 12th, 2024: 2pm UTC
    • December 10th, 2024: 9pm UTC
    • January 14th, 2025: 2pm UTC
    • February 11th, 2025: 9pm UTC
    • March 11th, 2025: 1pm UTC
    • April 8th, 2025: 9pm UTC
    • May 13th, 2025: 1pm UTC (Live from PyCon US!)
    • June 10th, 2025: 9pm UTC
    • July 9th, 2025: 1pm UTC
    • August 12th, 2025: 9pm UTC

Extras

Brian:

  • PyCascades CFP closes Friday, Sept 20 (https://2025.pycascades.com?featured_on=pythonbytes)
    • PyCascades is in Portland in 2025 (Feb 8 & 9)
  • uv now supports Python 3.13.0rc2 (https://github.com/astral-sh/uv/pull/7263?featured_on=pythonbytes)

    uv self update
    uv venv -p 3.13

  • Free threaded is still an open issue (https://github.com/astral-sh/uv/issues/7193?featured_on=pythonbytes)

Michael:

  • Big Python Humble Bundle with both of our products (https://www.humblebundle.com/software/next-level-python-from-talk-python-and-friends-software?featured_on=pythonbytes)
    • Get $1,800 worth of Python content and tools for $30 and contribute to charity
    • Includes 5 Talk Python courses (https://training.talkpython.fm/courses/all?featured_on=pythonbytes)
    • Several of Brian’s courses and his book
  • Djangonaut Space Session 3 Applications Open! (https://djangonaut.space/comms/2024-opening-session-3/?featured_on=pythonbytes)
    • I interviewed Sarah and Tushar on Talk Python (https://talkpython.fm/episodes/show/451/djangonauts-ready-for-blast-off?featured_on=pythonbytes)
  • AltTab: Windows alt-tab on macOS (https://alt-tab-macos.netlify.app?featured_on=pythonbytes)

Joke: Election joke (https://devhumor.com/media/elections-403-for-bidden?featured_on=pythonbytes)

Categories: FLOSS Project Planets

Tryton News: Security Release for issues #13505 and #13506

Tue, 2024-09-17 02:00

Albert Cervera has found that trytond allows users to execute reports for records to which they have no read access, as well as reports that are restricted to a set of groups to which the user does not belong.

Impact

CVSS v3.0 Base Score: 4.3

  • Attack Vector: Network
  • Attack Complexity: Low
  • Privileges Required: Low
  • User Interaction: None
  • Scope: Unchanged
  • Confidentiality: Low
  • Integrity: None
  • Availability: None

Workaround

There is no known workaround.

Resolution

All affected users should upgrade trytond to the latest version.

Affected versions per series:

  • trytond:
    • 7.2: <= 7.2.8
    • 7.0: <= 7.0.17
    • 6.0: <= 6.0.51

Non affected versions per series:

  • trytond:
    • 7.2: >= 7.2.9
    • 7.0: >= 7.0.18
    • 6.0: >= 6.0.52
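
The version ranges above can be turned into a quick check. This is only a sketch: is_affected is a hypothetical helper, not part of trytond, and it assumes plain "major.minor.patch" version strings.

```python
# Last affected patch release per series, copied from the advisory above.
LAST_AFFECTED = {(7, 2): 8, (7, 0): 17, (6, 0): 51}


def is_affected(version: str) -> bool:
    """Return True if a trytond version such as '7.2.8' is listed as affected.

    Hypothetical helper: a series the advisory does not mention returns False,
    which means "not listed", not "known to be safe".
    """
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    last_bad = LAST_AFFECTED.get((major, minor))
    return last_bad is not None and patch <= last_bad


for candidate in ("7.2.8", "7.2.9", "6.0.51", "6.0.52"):
    print(candidate, "affected" if is_affected(candidate) else "not listed as affected")
```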

Reference

Concerns?

Any security concerns should be reported on the bug-tracker at https://bugs.tryton.org/ with the confidential checkbox checked.

Categories: FLOSS Project Planets

Real Python: Using Python's pip to Manage Your Projects' Dependencies

Mon, 2024-09-16 10:00

The standard package manager for Python is pip. It allows you to install and manage packages that aren’t part of the Python standard library. If you’re looking for an introduction to pip, then you’ve come to the right place!

In this tutorial, you’ll learn how to:

  • Set up pip in your working environment
  • Fix common errors related to working with pip
  • Install and uninstall packages with pip
  • Manage projects’ dependencies using requirements files

You can do a lot with pip, but the Python community is very active and has created some neat alternatives to pip. You’ll learn about those later in this tutorial.

Get Your Cheat Sheet: Click here to download a free pip cheat sheet that summarizes the most important pip commands.

Getting Started With pip

So, what exactly does pip do? pip is a package manager for Python. That means it’s a tool that allows you to install and manage libraries and dependencies that aren’t distributed as part of the standard library. The name pip was introduced by Ian Bicking in 2008:

I’ve finished renaming pyinstall to its new name: pip. The name pip is [an] acronym and declaration: pip installs packages. (Source)

Package management is so important that Python’s installers have included pip since versions 3.4 and 2.7.9, for Python 3 and Python 2, respectively. Many Python projects use pip, which makes it an essential tool for every Pythonista.

The concept of a package manager might be familiar to you if you’re coming from another programming language. JavaScript uses npm for package management, Ruby uses gem, and the .NET platform uses NuGet. In Python, pip has become the standard package manager.

Finding pip on Your System

The Python installer gives you the option to install pip when installing Python on your system. In fact, the option to install pip with Python is checked by default, so pip should be ready for you to use after installing Python.

Note: On some Linux (Unix) systems like Ubuntu, pip comes in a separate package called python3-pip, which you need to install with sudo apt install python3-pip. It’s not installed by default with the interpreter.

You can verify that pip is available by looking for the pip3 executable on your system. Select your operating system below and use your platform-specific command accordingly:

Windows PowerShell

    PS> where pip3

The where command on Windows will show you where you can find the executable of pip3. If Windows can’t find an executable named pip3, then you can also try looking for pip without the three (3) at the end.

Shell

    $ which pip3

The which command on Linux systems and macOS shows you where the pip3 binary file is located.

On Windows and Unix systems, pip3 may be found in more than one location. This can happen when you have multiple Python versions installed. If you can’t find pip in any location on your system, then you may consider reinstalling pip.
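
The same lookup can also be done from Python itself. The sketch below uses shutil.which from the standard library, which searches PATH much as the shell commands above do; the printed paths depend entirely on your system.

```python
import shutil
import sys

# shutil.which returns the first match on PATH, or None if nothing is found.
candidates = {name: shutil.which(name) for name in ("pip3", "pip")}
for name, path in candidates.items():
    print(f"{name}: {path or 'not found on PATH'}")

# The interpreter running this script, for comparison with the paths above.
print(f"python: {sys.executable}")
```

If the pip path and the interpreter path live in different installations, that is a hint that pip may install packages for a different Python than the one you run.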

Instead of running your system pip directly, you can also run it as a Python module. In the next section, you’ll learn how.

Running pip as a Module

When you run your system pip directly, the command itself doesn’t reveal which Python version pip belongs to. This unfortunately means that you could use pip to install a package into the site-packages of an old Python version without noticing. To prevent this from happening, you should run pip as a Python module:

Shell

    $ python -m pip

Notice that you use python -m to run pip. The -m switch tells Python to run a module as an executable of the python interpreter. This way, you can ensure that your system default Python version runs the pip command. If you want to learn more about this way of running pip, then you can read Brett Cannon’s insightful article about the advantages of using python -m pip.

Note: Depending on how you installed Python, your Python executable may have a different name than python. You’ll see python used in this tutorial, but you may have to adapt the commands to use something like py or python3 instead.
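
The same idea carries over to scripts that need to call pip. A minimal sketch, assuming you want pip to run under the interpreter that is executing the script: sys.executable holds that interpreter's path, so no guessing between python, python3, or py is needed. The helper name pip_command is hypothetical.

```python
import subprocess
import sys


def pip_command(*args: str) -> list[str]:
    """Build a pip command line bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    # Equivalent to typing "python -m pip --version" with this exact interpreter.
    completed = subprocess.run(pip_command("--version"), capture_output=True, text=True)
    print(completed.stdout or completed.stderr)
```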

Sometimes you may want to be more explicit and limit packages to a specific project. In situations like this, you should run pip inside a virtual environment.

Using pip in a Python Virtual Environment

Read the full article at https://realpython.com/what-is-pip/ »

Categories: FLOSS Project Planets

TechBeamers Python: How to Create Dynamic QR Code in Python

Mon, 2024-09-16 08:14

This tutorial guides you on how to create dynamic QR codes in Python. It involves a bit more than just generating the QR code itself. Before reading this, you must know how a QR code generator works. Steps to Create Dynamic QR Codes Dynamic QR codes require the ability to track and update the information […]

The post How to Create Dynamic QR Code in Python appeared first on TechBeamers.

Categories: FLOSS Project Planets

PyCharm: 7 Ways To Use Jupyter Notebooks inside PyCharm

Mon, 2024-09-16 06:48

Jupyter notebooks allow you to tell stories by creating and sharing data, equations, and visualizations sequentially, with a supporting narrative as you go through the notebook.

Jupyter notebooks in PyCharm Professional provide functionality above and beyond that of browser-based Jupyter notebooks, such as code completion, dynamic plots, and quick statistics, to help you explore and work with your data quickly and effectively.  

Let’s take a look at 7 ways you can use Jupyter notebooks in PyCharm to achieve your goals. They are:

  • Creating or connecting to an existing notebook
  • Importing your data
  • Getting acquainted with your data
  • Using JetBrains AI Assistant 
  • Exploring your code with PyCharm
  • Getting insights from your code
  • Sharing your insights and charts

The Jupyter notebook that we used in this demo is available on GitHub.

1. Creating or connecting to an existing notebook

You can create and work on your Jupyter notebooks locally or connect to one remotely with PyCharm. Let’s take a look at both options so you can decide for yourself.

Creating a new Jupyter notebook

To work with a Jupyter notebook locally, you need to go to the Project tool window inside PyCharm, navigate to the location where you want to add the notebook, and invoke a new file. You can do this by using either your keyboard shortcuts ⌘N (macOS) / Alt+Ins (Windows/Linux) or by right-clicking and selecting New | Jupyter Notebook.

Give your new notebook a name, and PyCharm will open it ready for you to start work. You can also drag local Jupyter notebooks into PyCharm, and the IDE will automatically recognise them for you. 

Connecting to a remote Jupyter notebook

Alternatively, you can connect to a remote Jupyter notebook by selecting Tools | Add Jupyter Connection. You can then choose to start a local Jupyter server, connect to an existing running local Jupyter server, or connect to a Jupyter server using a URL – all of these options are supported.

Now you have your Jupyter notebook, you need some data!

2. Importing your data

Data generally comes in two formats, CSV or database. Let’s look at importing data from a CSV file first.

Importing from a CSV file

Polars and pandas are the two most commonly used libraries for importing data into Jupyter notebooks. I’ll give you code for both in this section, and you can check out the documentation for both Polars and pandas and learn how Polars is different from pandas.

You need to ensure your CSV is somewhere in your PyCharm project, perhaps in a folder called `data`. Then, you can import pandas and use it to read the data in:

    import pandas as pd

    df = pd.read_csv("../data/airlines.csv")

In this example, airlines.csv is the file containing the data we want to manipulate. To run this and any code cell in PyCharm, use ⇧⏎ (macOS) / Shift+Enter (Windows/Linux). You can also use the green run arrows on the toolbar at the top.

If you prefer to use Polars, you can use this code:

    import polars as pl

    df = pl.read_csv("../data/airlines.csv")

Importing from a database

If your data is in a database, as is often the case for internal projects, importing it into a Jupyter notebook will require just a few more lines of code. First, you need to set up your database connection. In this example, we’re using PostgreSQL.

For pandas, you need to use this code to read the data in:

    import pandas as pd
    from sqlalchemy import create_engine, text

    # Connection details for the local demo database
    engine = create_engine("postgresql://jetbrains:jetbrains@localhost/demo")
    df = pd.read_sql(sql=text("SELECT * FROM airlines"), con=engine.connect())
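
If you don’t have a PostgreSQL server to hand, the same read step can be tried against an in-memory SQLite database. This is a sketch under that substitution, not the setup used in this post: the table contents are made up, and only the table name and two column names are borrowed from the airlines example.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for the PostgreSQL "demo" database (assumption).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE airlines (AirportCode TEXT, NumDelaysWeather INTEGER)")
connection.executemany(
    "INSERT INTO airlines VALUES (?, ?)",
    [("ATL", 12), ("BOS", 7), ("ORD", 19)],  # made-up rows, not the real dataset
)
connection.commit()

# pandas accepts a sqlite3 connection directly, so no SQLAlchemy engine is needed here.
df = pd.read_sql("SELECT * FROM airlines", con=connection)
connection.close()
print(df)
```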

And for Polars, it’s this code:

    import polars as pl
    from sqlalchemy import create_engine

    engine = create_engine("postgresql://jetbrains:jetbrains@localhost/demo")
    connection = engine.connect()
    query = "SELECT * FROM airlines"
    df = pl.read_database(query, connection)

3. Getting acquainted with your data

Now we’ve read our data in, we can take a look at the DataFrame or `df` as we will refer to it in our code. To print out the DataFrame, you only need a single line of code, regardless of which method you used to read the data in:

    df

DataFrames

PyCharm first displays your DataFrame as a table so you can explore it. You can scroll horizontally through the DataFrame and click on any column header to order the data by that column. You can click on the Show Column Statistics icon on the right-hand side and select Compact or Detailed to get some helpful statistics on each column of data.

Dynamic charts

You can use PyCharm to get a dynamic chart of your DataFrame by clicking on the Chart View icon on the left-hand side. We’re using pandas in this example, but Polars DataFrames also have the same option. 

Click on the Show Series Settings icon (a cog) on the right-hand side to configure your plot to meet your needs:

In this view, you can hover your mouse over your data to learn more about it and easily spot outliers:

You can do all of this with Polars, too. 

4. Using JetBrains AI Assistant

JetBrains AI Assistant has several offerings that can make you more productive when you’re working with Jupyter notebooks inside PyCharm. Let’s take a closer look at how you can use JetBrains AI Assistant to explain a DataFrame, write code, and even explain errors. 

Explaining DataFrames

If you’ve got a DataFrame but are unsure where to start, you can click the purple AI icon on the right-hand side of the DataFrame and select Explain DataFrame. JetBrains AI Assistant will use its context to give you an overview of the DataFrame:

You can use the generated explanation to aid your understanding.

Writing Code 

You can also get JetBrains AI Assistant to help you write code. Perhaps you know what kind of plot you want, but you’re not 100% sure what the code should look like. Well, now you can use JetBrains AI Assistant to help you. Let’s say you want to use ‘matplotlib’ to create a chart that finds the relationship between ‘TimeMonthName’ and ‘MinutesDelayedWeather’. By specifying the column names, we’re giving more context to the request which improves the reliability of the generated code. Try it with the following prompt:

Give me code using matplotlib to create a chart which finds the relationship between ‘TimeMonthName’ and ‘MinutesDelayedWeather’ for my dataframe df

If you like the resulting code, you can use the Insert Snippet at Caret button to insert the code and then run it:

    import matplotlib.pyplot as plt

    # Assuming your data is in a DataFrame named 'df'
    # Replace 'df' with the actual name of your DataFrame if different

    # Plotting
    plt.figure(figsize=(10, 6))
    plt.bar(df['TimeMonthName'], df['MinutesDelayedWeather'], color='skyblue')
    plt.xlabel('Month')
    plt.ylabel('Minutes Delayed due to Weather')
    plt.title('Relationship between TimeMonthName and MinutesDelayedWeather')
    plt.xticks(rotation=45)
    plt.grid(axis='y', linestyle='--', alpha=0.7)
    plt.tight_layout()
    plt.show()

If you don’t want to open the AI Assistant tool window, you can use the AI cell prompt to ask your questions. For example, we can ask the same question here and get the code we need:

Explaining errors

You can also get JetBrains AI Assistant to explain errors for you. When you get an error, click Explain with AI.

You can use the resulting output to further your understanding of the problem and perhaps even get some code to fix it!

5. Exploring your code

PyCharm can help you get an overview of your Jupyter notebook, complete parts of your code to save your fingers, refactor it as required, debug it, and even add integrations to help you take it to the next level.

Tips for navigating and optimizing your code

Jupyter notebooks can grow large quite quickly, but thankfully you can use PyCharm’s Structure view to see all your notebook’s headings by pressing ⌘7 (macOS) / Alt+7 (Windows/Linux).

Code completion

Another helpful feature that you can take advantage of when using Jupyter notebooks inside PyCharm is code completion. You get both basic and type-based code completion out of the box with PyCharm, but you can also enable Full Line Code Completion in PyCharm Professional, which uses a local AI model to provide suggestions. Lastly, JetBrains AI Assistant can also help you write code and discover new libraries and frameworks. 

Refactoring

Sometimes you need to refactor your code, and in that case, you only need to know one keyboard shortcut: ⌃T (macOS) / Shift+Ctrl+Alt+T (Windows/Linux). You can then choose the refactoring you want to invoke. Pick from popular options such as Rename, Change Signature, and Introduce Variable, or lesser-known options such as Extract Method, to change your code without changing the semantics.

As your Jupyter notebook grows, it’s likely that your list of import statements will also grow. Sometimes you might import packages such as polars and numpy, but forget that numpy is a transitive dependency of the Polars library and, as such, you don’t need to import it separately.

To catch these cases and keep your code tidy, you can invoke Optimize Imports ⌃⌥O (macOS) / Ctrl+Alt+O (Windows/Linux) and PyCharm will remove the ones you don’t need. 

Debugging your code

You might not have used the debugger in PyCharm yet, and that’s okay. Just know that it’s there and ready to support you when you need to better understand some behavior in your Jupyter notebook. 

Place a breakpoint on the line you’re interested in by clicking in the gutter or by using ⌘F8 (macOS) / Ctrl+F8 (Windows/Linux), and then run your code with the debugger attached with the debug icon on the top toolbar:

You can also invoke PyCharm’s debugger in your Jupyter notebook with ⌥⇧⏎ (macOS) / Shift+Alt+Enter (Windows/Linux). There are some restrictions when it comes to debugging your code in a Jupyter notebook, but please try this out for yourself and share your feedback with us. 

Adding integrations into PyCharm 

IDEs wouldn’t be complete without the integrations you need. PyCharm Professional 2024.2 brings two new integrations to your workflow: Databricks and Hugging Face.

You can enable the integrations with both Databricks and Hugging Face by going to your Settings ⌘, (macOS) / Ctrl+Alt+S (Windows/Linux), selecting Plugins, and searching for the plugin with the corresponding name on the Marketplace tab.

6. Getting insights from your code

When analyzing your data, there’s a difference between categorical and continuous variables. Categorical data has a finite number of discrete groups or categories, whereas continuous data is one continuous measurement. Let’s look at how we can extract different insights from both the categorical and continuous variables in our airlines dataset.

Continuous variables

We can get a sense of how continuous data is distributed by looking at measures of the average value in that data and the spread of the data around the average. In normally distributed data, we can use the mean to measure the average and the standard deviation to measure the spread. However, when data is not distributed normally, we can get more accurate information using the median and the interquartile range (this is the difference between the seventy-fifth and twenty-fifth percentiles). Let’s look at one of our continuous variables to understand the difference between these measurements.
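
Before turning to the real column, here is a small sketch of those four measures in pandas. The values are made up and deliberately right-skewed; they are not taken from the airlines dataset.

```python
import pandas as pd

# Made-up, right-skewed values; not taken from the airlines dataset.
delays = pd.Series([100, 200, 300, 400, 500, 600, 700, 4000])

mean = delays.mean()
std = delays.std()                      # sample standard deviation (ddof=1)
median = delays.median()
q1, q3 = delays.quantile([0.25, 0.75])  # twenty-fifth and seventy-fifth percentiles
iqr = q3 - q1                           # interquartile range

print(f"mean={mean:.1f} std={std:.1f} median={median:.1f} IQR={iqr:.1f}")
```

With a single large value in the series, the mean and standard deviation are pulled upwards, while the median and interquartile range stay close to the bulk of the data.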

In our dataset, we have lots of continuous variables, but we’ll work with `NumDelaysLateAircraft` to see what we can learn. Let’s use the following code to get some summary statistics for just that column:

df['NumDelaysLateAircraft'].describe()

Looking at this data, we can see that there is a big difference between the `mean` of ~789 and the median (our fiftieth percentile, indicated by `50%` in the table below) of ~618.

This indicates a skew in our variable’s distribution, so let’s use PyCharm to explore it further. Click on the Chart View icon at the top left. Once the chart has been rendered, we’ll change the series settings represented by the cog on the right-hand side of the screen. Change your x-axis to `NumDelaysLateAircraft` and your y-axis to `NumDelaysLateAircraft`. 

Now drop down the y-axis using the little arrow and select `count`. The final step is to change the chart type to Histogram using the icons in the top-right corner:

Now that we can see the skew laid out visually, we can see that most of the time, the delays are not too excessive. However, we have a number of more extreme delays – one aircraft is an outlier on the right and it was delayed by 4,509 minutes, which is just over three days!

In statistics, the mean is very sensitive to outliers because it’s an arithmetic average: every value, however extreme, contributes to the sum. The median, by contrast, is the value that would sit exactly in the middle if you ordered all observations in your variable. When the mean is higher than the median, it’s because you have outliers on the right-hand side of the data, the higher side, as we had here. In such cases, the median is a better indicator of the typical delay, as you can see if you look at the histogram.
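
To see that sensitivity in isolation, the following sketch adds one extreme value to a small made-up list and compares how far each measure moves. It uses only the standard library’s statistics module.

```python
from statistics import mean, median

typical = [10, 12, 14, 16, 18]
with_outlier = typical + [500]  # one extreme value on the right

# How far does each measure move when the outlier is added?
mean_shift = mean(with_outlier) - mean(typical)
median_shift = median(with_outlier) - median(typical)

print(f"mean moved by {mean_shift}, median moved by {median_shift}")
```

The mean jumps by a large amount because the outlier is added into the sum, while the median only shifts to the neighbouring middle values.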

Categorical variables

Let’s take a look at how we can use code to get some insights from our categorical variables. In order to get something that’s a little more interesting than just `AirportCode`, we’ll analyze how many aircraft were delayed by weather, `NumDelaysWeather`, in the different months of the year, `TimeMonthName`.

Use this code to group `NumDelaysWeather` with `TimeMonthName`:

    result = df[['TimeMonthName', 'NumDelaysWeather']].groupby('TimeMonthName').sum()
    result

This gives us the DataFrame again in table format, but click the Chart View icon on the left-hand side of the PyCharm UI to see what we can learn:

This is okay, but it would be helpful to have the months ordered according to the Gregorian calendar. Let’s first create a variable for the months that we expect:

    month_order = [
        "January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December"
    ]

Now we can ask pandas to use the order that we’ve just defined in `month_order`:

    # Convert the 'TimeMonthName' column to a categorical type with the specified order
    df["TimeMonthName"] = pd.Categorical(df["TimeMonthName"], categories=month_order, ordered=True)

    # Now you can group by 'TimeMonthName' and perform sum operation, specifying observed=False
    result = df[['TimeMonthName', 'NumDelaysWeather']].groupby('TimeMonthName', observed=False).sum()
    result

We then click on the Chart View icon once more, but something’s wrong!

Are we really saying that there were no flights delayed in February? That can’t be right. Let’s check our assumption with some more code:

df['TimeMonthName'].value_counts()

Aha! Now we can see that February has been misspelt as `Febuary` in our dataset, so it does not match the correct spelling in our `month_order` list. Let’s update the spelling in our dataset with this code:

df["TimeMonthName"] = df["TimeMonthName"].replace("Febuary", "February")
df['TimeMonthName'].value_counts()
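As an aside, the empty February can be reproduced on a tiny example. The sketch below uses a made-up three-row-per-month miniature of the data (the frame `toy` and its values are assumptions, not the real dataset) to show that labels missing from the category list become missing values and drop out of the grouped sum.

```python
import pandas as pd

month_order = ["January", "February", "March"]  # shortened for the example

# Hypothetical miniature of the flights data, including the misspelt label.
toy = pd.DataFrame({
    "TimeMonthName": ["January", "Febuary", "March", "Febuary"],
    "NumDelaysWeather": [10, 20, 30, 40],
})

# Labels present in the data but absent from the expected categories.
unexpected = sorted(set(toy["TimeMonthName"]) - set(month_order))
print(unexpected)

# Converting to an ordered categorical turns those labels into missing values,
# so their rows are left out of the grouped sum and February totals zero.
as_cat = pd.Categorical(toy["TimeMonthName"], categories=month_order, ordered=True)
totals = (
    toy.assign(TimeMonthName=as_cat)
    .groupby("TimeMonthName", observed=False)["NumDelaysWeather"]
    .sum()
)
print(totals)
```

Running a set difference like this before converting to a categorical is a cheap way to catch spelling problems early.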

Great, that looks right. Now we should be able to re-run our earlier code and get a chart view that we can interpret:

From this view, we can see that there is a higher number of delays during the months of December, January, and February, and then again in June, July, and August. However, we have not standardized this data against the total number of flights, so the higher counts in those winter and summer months may simply reflect more flights rather than a higher rate of weather delays.
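One way to standardize would be to divide the weather delays by the number of flights in each month. The sketch below is only an illustration: the column name `NumFlightsTotal`, the derived `WeatherDelayRate`, and all of the numbers are assumptions, so substitute whichever column in your data holds the total flight count.

```python
import pandas as pd

# Hypothetical numbers; `NumFlightsTotal` is an assumed column name.
toy = pd.DataFrame({
    "TimeMonthName": ["January", "January", "July", "July"],
    "NumDelaysWeather": [30, 50, 60, 100],
    "NumFlightsTotal": [1000, 1000, 4000, 4000],
})

# Sum delays and flights per month, then express delays as a share of flights.
monthly = toy.groupby("TimeMonthName")[["NumDelaysWeather", "NumFlightsTotal"]].sum()
monthly["WeatherDelayRate"] = monthly["NumDelaysWeather"] / monthly["NumFlightsTotal"]
print(monthly)
```

In this made-up example July has twice as many weather delays as January, but half the delay rate, which is exactly the kind of reversal that raw counts can hide.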

7. Sharing your insights and charts

When your masterpiece is complete, you’ll probably want to export data, and you can do that in various ways with Jupyter notebooks in PyCharm. 

Exporting a DataFrame

You can export a DataFrame by clicking on the down arrow on the right-hand side:

You have lots of helpful formats to choose from, including SQL, CSV, and JSON:

Exporting charts

If you prefer to export the interactive plot, you can do that too by clicking on the Export to PNG icon on the right-hand side:

Viewing your notebook in a browser

You can view your whole Jupyter notebook at any time in a browser by clicking the icon in the top-right corner of your notebook:

Finally, if you want to export your Jupyter notebook to a Python file, 2024.2 lets you do that too! Right-click on your Jupyter notebook in the Project tool window and select Convert to Python File. Follow the instructions, and you’re done!

Summary

Using Jupyter notebooks inside PyCharm Professional provides extensive functionality, enabling you to create code faster, explore data easily, and export your projects in the formats that matter to you. 

Download PyCharm Professional to try it out for yourself! Get an extended trial today and experience the difference PyCharm Professional can make in your data science endeavors.

Use the promo code “PyCharmNotebooks” at checkout to activate your free 60-day subscription to PyCharm Professional. The free subscription is available for individual users only.

Activate your 60-day trial
Categories: FLOSS Project Planets

Zato Blog: Smart IoT integrations with Akenza and Python

Mon, 2024-09-16 04:00
2024-09-16, by Dariusz Suchojad

Overview

The Akenza IoT platform, on its own, excels in collecting and managing data from a myriad of IoT devices. However, it is integrations with other systems, such as enterprise resource planning (ERP), customer relationship management (CRM) platforms, workflow management or environmental monitoring tools that enable a complete view of the entire organizational landscape.

Complementing Akenza's capabilities, and enabling smooth integrations, is the versatility of Python. Given how flexible Python is, the language is a natural choice when looking for a bridge between Akenza and the unique requirements of an organization looking to connect its intelligent infrastructure.

This article is about combining the two, Akenza and Python. At the end of it, you will have:

  • A bi-directional connection to Akenza using Python and WebSockets
  • A Python service subscribed to and receiving events from IoT devices through Akenza
  • A Python service that will be sending data to IoT devices through Akenza

Since WebSocket connections are persistent, their usage enhances the responsiveness of IoT applications, which in turn helps data exchange occur in real time, thus fostering a dynamic and agile integrated ecosystem.

Python and Akenza WebSocket connections

First, let's have a look at the full Python code, to be discussed later.

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import WSXAdapter

# ###############################################################################################
# ###############################################################################################

if 0:
    from zato.server.generic.api.outconn.wsx.common import OnClosed, \
        OnConnected, OnMessageReceived

# ###############################################################################################
# ###############################################################################################

class DemoAkenza(WSXAdapter):

    # Our name
    name = 'demo.akenza'

    def on_connected(self, ctx:'OnConnected') -> 'None':
        self.logger.info('Akenza OnConnected -> %s', ctx)

# ###############################################################################################

    def on_message_received(self, ctx:'OnMessageReceived') -> 'None':

        # Confirm what we received
        self.logger.info('Akenza OnMessageReceived -> %s', ctx.data)

        # This is an indication that we are connected ..
        if ctx.data['type'] == 'connected':

            # .. for testing purposes, use a fixed asset ID ..
            asset_id:'str' = 'abc123'

            # .. build our subscription message ..
            data = {'type': 'subscribe', 'subscriptions': [{'assetId': asset_id, 'topic': '*'}]}

            ctx.conn.send(data)

        else:
            # .. if we are here, it means that we received a message other than type "connected".
            self.logger.info('Akenza message (other than "connected") -> %s', ctx.data)

# ##############################################################################################

    def on_closed(self, ctx:'OnClosed') -> 'None':
        self.logger.info('Akenza OnClosed -> %s', ctx)

# ##############################################################################################
# ##############################################################################################

Now, deploy the code to Zato and create a new outgoing WebSocket connection. Replace the API key with your own and make sure to set the data format to JSON.

Receiving messages from WebSockets

The WebSocket Python services that you author have three methods of interest, each reacting to specific events:

  • on_connected - Invoked as soon as a WebSocket connection has been opened. Note that this is a low-level event and, in the case of Akenza, it does not mean yet that you are able to send or receive messages from it.

  • on_message_received - The main method that you will be spending the most time with. Invoked each time a remote WebSocket sends, or pushes, an event to your service. With Akenza, this method will be invoked each time Akenza has something to inform you about, e.g. that you subscribed to messages.

  • on_closed - Invoked when a WebSocket has been closed. It is no longer possible to use a WebSocket once it has been closed.

Let's focus on on_message_received, which is where the majority of the action takes place. It receives a single parameter of type OnMessageReceived which describes the context of the received message. That is, it is in the "ctx" that you will find both the current request and a handle to the WebSocket connection through which you can reply to the message.

The two important attributes of the context object are:

  • ctx.data - A dictionary of data that Akenza sent to you

  • ctx.conn - The underlying WebSocket connection through which the data was sent and through which you can send a response

Now, the logic of on_message_received is clear:

  • First, we check if Akenza confirmed that we are connected (type=='connected'). You need to check the type of a message each time Akenza sends something to you and react to it accordingly.

  • Next, because we know that we are already connected (e.g. our API key was valid) we can subscribe to events from a given IoT asset. For testing purposes, the asset ID is given directly in the source code but, in practice, this information would be read from a configuration file or database.

  • Finally, for messages of any other type we simply log their details. Naturally, a full integration would handle them per what is required in given circumstances, e.g. by transforming and pushing them to other applications or management systems.
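The branching described above can be tried out without a Zato server. The following is only a sketch: `FakeConn` and `handle_message` are hypothetical stand-ins written for this illustration, not part of Zato or Akenza, and the context object is simulated with `SimpleNamespace`.

```python
from types import SimpleNamespace

class FakeConn:
    """Hypothetical stand-in for the WebSocket connection; records what is sent."""

    def __init__(self):
        self.sent = []

    def send(self, data):
        self.sent.append(data)

def handle_message(ctx, asset_id='abc123'):
    """Mirror the branching of on_message_received for a single message."""
    if ctx.data['type'] == 'connected':
        # Akenza confirmed the connection, so subscribe to all topics of the asset.
        ctx.conn.send({'type': 'subscribe', 'subscriptions': [{'assetId': asset_id, 'topic': '*'}]})
        return 'subscribe-request-sent'
    # Messages of any other type are only reported here.
    return 'logged'

conn = FakeConn()
first = handle_message(SimpleNamespace(data={'type': 'connected'}, conn=conn))
second = handle_message(SimpleNamespace(data={'type': 'subscribed'}, conn=conn))
print(first, second, conn.sent)
```

Keeping the decision logic in a plain function like this makes it easy to test each message type in isolation before wiring it into the real service.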

A sample message from Akenza will look like this:

INFO - WebSocketClient - Akenza message (other than "connected") -> {'type': 'subscribed', 'replyTo': None, 'timeStamp': '2023-11-20T13:32:50.028Z', 'subscriptions': [{'assetId': 'abc123', 'topic': '*', 'tagId': None, 'valid': True}], 'message': None}

How to send messages to WebSockets

An aspect not to be overlooked is communication in the other direction, that is, sending of messages to WebSockets. For instance, you may have services invoked through REST APIs, or perhaps from a scheduler, and their job will be to transform such calls into configuration commands for IoT devices.

Here is the core part of such a service, reusing the same Akenza WebSocket connection:

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

# ##############################################################################################
# ##############################################################################################

class DemoAkenzaSend(Service):

    # Our name
    name = 'demo.akenza.send'

    def handle(self) -> 'None':

        # The connection to use
        conn_name = 'Akenza'

        # Get a connection ..
        with self.out.wsx[conn_name].conn.client() as client:

            # .. and send data through it.
            client.send('Hello')

# ##############################################################################################
# ##############################################################################################

Note that responses to the messages sent to Akenza will be received using your first service's on_message_received method - WebSockets-based messaging is inherently asynchronous and the channels are independent.

Now, we have a complete picture of real-time, IoT connectivity with Akenza and WebSockets. We are able to establish persistent, responsive connections to assets, we can subscribe to and send messages to devices, and that lets us build intelligent automation and integration architectures that make use of powerful, emerging technologies.

More resources

➤ Python API integration tutorial
What is an integration platform?
Python Integration platform as a Service (iPaaS)
What is an Enterprise Service Bus (ESB)? What is SOA?

More blog posts
Categories: FLOSS Project Planets

Django Weblog: Nominate a Djangonaut for the 2024 Malcolm Tredinnick Memorial Prize

Mon, 2024-09-16 01:01

Hello Everyone 👋 It is that time of year again when we recognize someone from our community in memory of our friend Malcolm.

Malcolm was an early core contributor to Django and had both a huge influence and impact on Django as we know it today. Besides being knowledgeable he was also especially friendly to new users and contributors. He exemplified what it means to be an amazing Open Source contributor. We still miss him to this day.

The prize

The Django Software Foundation Prizes page summarizes it nicely:

The Malcolm Tredinnick Memorial Prize is a monetary prize, awarded annually, to the person who best exemplifies the spirit of Malcolm’s work - someone who welcomes, supports, and nurtures newcomers; freely gives feedback and assistance to others, and helps to grow the community. The hope is that the recipient of the award will use the award stipend as a contribution to travel to a community event -- a DjangoCon, a PyCon, a sprint -- and continue in Malcolm’s footsteps.

Please make your nominations using our form: 2024 Malcolm Tredinnick Memorial Prize.

We will take nominations until Monday, September 30th, 2024, Anywhere on Earth, and will announce the winner(s) soon after the next DSF Board meeting in October. If you have any questions please reach out to the DSF Board at foundation@djangoproject.com.

Submit a nomination

Categories: FLOSS Project Planets

Python⇒Speed: Let's build and optimize a Rust extension for Python

Sun, 2024-09-15 20:00

If your Python code isn’t fast enough, you have many options for compiled languages to write a faster extension. In this article we’ll focus on Rust, which benefits from:

  • Modern tooling, including a package repository called crates.io, and built-in build tool (cargo).
  • Excellent Python integration and tooling. The Rust package (they’re known as “crates”) for Python support is PyO3. For packaging you can use setuptools-rust, for integration with existing setuptools projects, or for standalone extensions you can use Maturin.
  • Memory and thread safety, which make it much less prone to crashes or memory corruption than C and C++.

In particular, we’ll:

  • Implement a small algorithm in Python.
  • Re-implement it as a Rust extension.
  • Optimize the Rust version so it runs faster.
Read more...
Categories: FLOSS Project Planets
