Planet Python

Planet Python - http://planetpython.org/

ListenData: How to Use Gemini in Python

Mon, 2024-05-13 16:54

In this tutorial, you will learn how to use Google's Gemini AI model in Python.

Steps to Access Gemini API

Follow the steps below to access the Gemini API and then use it in Python.

  1. Visit the Google AI Studio website.
  2. Sign in using your Google account.
  3. Create an API key.
  4. Install the Google AI Python library for the Gemini API using the command below: pip install google-generativeai. (A minimal usage sketch follows this list.)
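
Once the key is created and the library is installed, a minimal sketch looks like this. The API key is a placeholder you must replace with your own, and the model name shown here may differ depending on what is currently available:

import google.generativeai as genai

# Configure the client with the API key from Google AI Studio (placeholder value)
genai.configure(api_key="YOUR_API_KEY")

# Pick a Gemini model; the exact name may vary by availability
model = genai.GenerativeModel("gemini-pro")

# Send a prompt and print the generated text
response = model.generate_content("Explain what Python decorators are in one sentence.")
print(response.text)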
To read this article in full, please click here. This post appeared first on ListenData.
Categories: FLOSS Project Planets

Tryton News: Release 1.5.0 of python-sql

Mon, 2024-05-13 13:05

We are proud to announce the release of the version 1.5.0 of python-sql.

python-sql is a library to write SQL queries in a pythonic way. It is mainly developed for Tryton but it has no external dependencies and is agnostic to any framework or SQL database.

In addition to bug-fixes, this release contains the following improvements:

  • Add MERGE query
  • Support “UPSERT” with ON CONFLICT clause on INSERT query
  • Remove default escape char on LIKE and ILIKE
  • Add GROUPING SETS, CUBE, and ROLLUP clauses for GROUP BY
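
To give a feel for the library's pythonic style, here is a minimal sketch of building a query; the table and column names are made up for illustration, and the rendered SQL shown in the comment is approximate:

from sql import Table

user = Table('user')

# Build a SELECT with a WHERE clause using Python operators
query = user.select(user.id, user.name, where=user.active == True)

# A query renders to a (sql, params) pair
print(tuple(query))
# roughly: ('SELECT "a"."id", "a"."name" FROM "user" AS "a" WHERE ("a"."active" = %s)', (True,))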

python-sql is available on PyPI: python-sql 1.5.0.


Categories: FLOSS Project Planets

Mike Driscoll: How to Annotate a Graph with Matplotlib and Python

Mon, 2024-05-13 10:37

The Matplotlib package is great for visualizing data. One of its many features is the ability to annotate points on your graph. You can use annotations to explain why a particular data point is significant or interesting.

If you haven’t used Matplotlib before, you should check out my introductory article, Matplotlib – An Intro to Creating Graphs with Python or read the official documentation.

Let’s get started!

Installing Matplotlib

If you don’t have Matplotlib on your computer, you must install it. Fortunately, you can use pip, the Python package manager utility that comes with Python.

Open up your terminal or command prompt and run the following command:

python -m pip install matplotlib

Pip will now install Matplotlib and any dependencies that Matplotlib needs to work properly. Assuming that Matplotlib installs successfully, you are good to go!

Annotating Points on a Graph

Matplotlib comes with a handy annotate() method that you can use. As with most of Matplotlib’s methods, annotate() can take quite a few different parameters.

For this example, you will be using the following parameters:

  • text – The label for the annotation
  • xy – The x/y coordinate of the point of interest
  • arrowprops – A dictionary of arrow properties
  • xytext – Where to place the text for the annotation

Now that you know what you’re doing, open up your favorite Python IDE or text editor and create a new Python file. Then enter the following code:

import matplotlib.pylab as plt
import numpy as np

def annotated():
    fig = plt.figure(figsize=(8, 6))
    numbers = list(range(10))
    plt.plot(numbers, np.exp(numbers))
    plt.title("Annotating an Exponential Plot using plt.annotate()")
    plt.xlabel("x-axis")
    plt.ylabel("y-axis")
    plt.annotate("Point 1", xy=(6, 400),
                 arrowprops=dict(arrowstyle="->"),
                 xytext=(4, 600))
    plt.annotate("Point 2", xy=(7, 1150),
                 arrowprops=dict(arrowstyle="->",
                                 connectionstyle="arc3,rad=-.2"),
                 xytext=(4.5, 2000))
    plt.annotate("Point 3", xy=(8, 3000),
                 arrowprops=dict(arrowstyle="->",
                                 connectionstyle="angle,angleA=90,angleB=0"),
                 xytext=(8.5, 2200))
    plt.show()

if __name__ == "__main__":
    annotated()

Here, you are creating a simple line graph. You want to annotate three points on the graph. The arrowprops define the arrowstyle and, in the latter two points, the connectionstyle. These properties tell Matplotlib what type of arrow to use and whether it should be connected to the text as a straight line, an arc, or a 90-degree turn.

When you run this code, you will see the following graph:

You can see how the different points are located and how the arrowprops lines are changed. You should check out the full documentation to learn all the details about the arrows and annotations.

Wrapping Up

Annotating your graph is a great way to make your plots more informative. Matplotlib allows you to add many different labels to your plots, and annotating the interesting data points is quite nice.

You should spend some time experimenting with annotations and learning all the different parameters annotate() takes to fully understand this useful feature.

The post How to Annotate a Graph with Matplotlib and Python appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Real Python: What Is the __pycache__ Folder in Python?

Mon, 2024-05-13 10:00

When you develop a self-contained Python script, you might not notice anything unusual about your directory structure. However, as soon as your project becomes more complex, you’ll often decide to extract parts of the functionality into additional modules or packages. That’s when you may start to see a __pycache__ folder appearing out of nowhere next to your source files in seemingly random places:

project/
│
├── mathematics/
│   │
│   ├── __pycache__/
│   │
│   ├── arithmetic/
│   │   ├── __init__.py
│   │   ├── add.py
│   │   └── sub.py
│   │
│   ├── geometry/
│   │   │
│   │   ├── __pycache__/
│   │   │
│   │   ├── __init__.py
│   │   └── shapes.py
│   │
│   └── __init__.py
│
└── calculator.py

Notice that the __pycache__ folder can be present at different levels in your project’s directory tree when you have multiple subpackages nested in one another. At the same time, other packages or folders with your Python source files may not contain this mysterious cache directory.

Note: To maintain a cleaner workspace, many Python IDEs and code editors are configured out-of-the-box to hide the __pycache__ folders from you, even if those folders exist on your file system.

You may encounter a similar situation after you clone a remote Git repository with a Python project and run the underlying code. So, what causes the __pycache__ folder to appear, and for what purpose?


In Short: It Makes Importing Python Modules Faster

Even though Python is an interpreted programming language, its interpreter doesn’t operate directly on your Python code, which would be very slow. Instead, when you run a Python script or import a Python module, the interpreter compiles your high-level Python source code into bytecode, which is an intermediate binary representation of the code.

This bytecode enables the interpreter to skip recurring steps, such as lexing and parsing the code into an abstract syntax tree and validating its correctness every time you run the same program. As long as the underlying source code hasn’t changed, Python can reuse the intermediate representation, which is immediately ready for execution. This saves time, speeding up your script’s startup time.

Remember that while loading the compiled bytecode from __pycache__ makes Python modules import faster, it doesn’t affect their execution speed!

Bytecode vs Machine Code

Why bother with bytecode at all instead of compiling the code straight to the low-level machine code? While machine code is what executes on the hardware, providing the ultimate performance, it’s not as portable or quick to produce as bytecode.

Machine code is a set of binary instructions understood by your specific CPU architecture, wrapped in a container format like EXE, ELF, or Mach-O, depending on the operating system. In contrast, bytecode provides a platform-independent abstraction layer and is typically quicker to compile.

Python uses local __pycache__ folders to store the compiled bytecode of imported modules in your project. On subsequent runs, the interpreter will try to load precompiled versions of modules from these folders, provided they’re up-to-date with the corresponding source files. Note that this caching mechanism only gets triggered for modules you import in your code rather than executing as scripts in the terminal.

In addition to this on-disk bytecode caching, Python keeps an in-memory cache of modules, which you can access through the sys.modules dictionary. It ensures that when you import the same module multiple times from different places within your program, Python will use the already imported module without needing to reload or recompile it. Both mechanisms work together to reduce the overhead of importing Python modules.
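
You can observe the in-memory cache directly; this short sketch shows that a second import reuses the cached module object:

import json
import sys

print('json' in sys.modules)        # True: the module was cached when imported above
print(sys.modules['json'] is json)  # True: repeated imports reuse the same object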

Next, you’re going to find out exactly how much faster Python loads the cached bytecode as opposed to compiling the source code on the fly when you import a module.

How Much Faster Is Loading Modules From Cache?

The caching happens behind the scenes and usually goes unnoticed since Python is quite rapid at compiling the bytecode. Besides, unless you often run short-lived Python scripts, the compilation step remains insignificant when compared to the total execution time. That said, without caching, the overhead associated with bytecode compilation could add up if you had lots of modules and imported them many times over.

To measure the difference in import time between a cached and uncached module, you can pass the -X importtime option to the python command or set the equivalent PYTHONPROFILEIMPORTTIME environment variable. When this option is enabled, Python will display a table summarizing how long it took to import each module, including the cumulative time in case a module depends on other modules.
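
For example, assuming the calculator.py script discussed below, either of these invocations prints the import-time table:

$ python -X importtime calculator.py
$ PYTHONPROFILEIMPORTTIME=1 python calculator.py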

Suppose you had a calculator.py script that imports and calls a utility function from a local arithmetic.py module:

# calculator.py
from arithmetic import add

add(3, 4)

The imported module defines a single function:

# arithmetic.py
def add(a, b):
    return a + b

As you can see, the main script delegates the addition of two numbers, three and four, to the add() function imported from the arithmetic module.

Note: Even though you use the from ... import syntax, which only brings the specified symbol into your current namespace, Python reads and compiles the entire module anyway. Moreover, unused imports would also trigger the compilation.

The first time you run your script, Python compiles and saves the bytecode of the module you imported into a local __pycache__ folder. If such a folder doesn’t already exist, then Python automatically creates one before moving on. Now, when you execute your script again, Python should find and load the cached bytecode as long as you didn’t alter the associated source code.
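
After that first run, you can see the cached file for yourself. The file name encodes the interpreter and its version, so the exact name depends on your setup; on CPython 3.12 it might look like this:

$ python calculator.py
$ ls __pycache__
arithmetic.cpython-312.pyc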

Read the full article at https://realpython.com/python-pycache/ »


Categories: FLOSS Project Planets

Zato Blog: IMAP and OAuth2 Integrations with Microsoft 365

Mon, 2024-05-13 04:00
IMAP and OAuth2 Integrations with Microsoft 365 2024-05-13, by Dariusz Suchojad

Overview

This is the first in a series of articles about automation of and integrations with Microsoft 365 cloud products using Python and Zato.

We start off with IMAP automation by showing how to create a scheduled Python service that periodically pulls latest emails from Outlook using OAuth2-based connections.

IMAP and OAuth2

Microsoft 365 requires all IMAP connections to use OAuth2. This can be challenging to configure in server-side automation and orchestration processes, so Zato offers an easy way that lets you read and send emails without a need for getting into low-level OAuth2 details.

Consider a common orchestration scenario: a business partner sends automated emails with attachments that need to be parsed, and some information needs to be extracted and processed accordingly.

Before OAuth2, an automation process would receive from Azure administrators a dedicated IMAP account with a username and password.

Now, however, in addition to creating an IMAP account, administrators will need to create and configure a few more resources that the orchestration service will use. Note that the password to the IMAP account will never be used.

Administrators need to:

  • Register an Azure client app representing your service that uses IMAP
  • Grant this app a couple of Microsoft Graph application permissions:
  • Mail.ReadWrite
  • Mail.Send

Next, administrators need to give you a few pieces of information about the app:

  • Application (client) ID
  • Tenant (directory) ID
  • Client secret

Additionally, you still need to receive the IMAP username (an e-mail address). It is just that you do not need its corresponding password.

In Dashboard

The first step is to create a new connection in your Zato Dashboard - this will establish an OAuth2-based connection that Zato will manage. Your Python code will not have to do anything else: all the underlying OAuth2 tokens will keep refreshing as needed, and the platform will take care of everything.

Having received the configuration details from Azure administrators, you can open your Zato Dashboard and navigate to IMAP connections:

Fill out the form as below, choosing "Microsoft 365" as the server type. The other type, "Generic IMAP", is used for the classical case of IMAP with a username and password:

Change the secret and click Ping to confirm that the connection is configured correctly:

In Python

Use the code below to receive emails. Note that it merely needs to refer to a connection definition by its name and there is no need for any usage of OAuth2 here:

# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class MyService(Service):

    def handle(self):

        # Connect to a Microsoft 365 IMAP connection by its name ..
        conn = self.email.imap.get('My Automation').conn

        # .. get all messages matching filter criteria ("unread" by default) ..
        for msg_id, msg in conn.get():

            # .. and access each of them.
            self.logger.info(msg.data)

This is everything that is needed for integrations with IMAP using Microsoft 365 although we can still go further. For instance, to create a scheduled job to periodically invoke the service, go to the Scheduler job in Dashboard:

In this case, we decide to have a job that runs once per hour:

As expected, clicking OK will suffice for the job to start in background. It is as simple as that.

Categories: FLOSS Project Planets

Pythonicity: Packaging rundown

Sat, 2024-05-11 20:00
Companion guide to the Python packaging tutorial.

This is not an overview of packaging, nor a history of the tooling. The intended audience is an author of a simple package who merely wants to publish it on the package index, without being forced to make uninformed choices.

Build backends

The crux of the poor user experience is choosing a build backend. The reader at this stage does not know what a “build backend” is, and moreover does not care.

The 4 backends in the tutorial are described here in their presented order. An example snippet of a pyproject.toml file is included for each, mostly assuming defaults, with a couple of common options:

hatchling

- 83 kB with 4 dependencies

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.sdist]
include = ["<package>/*"]

[tool.hatch.version]
path = "<package>/__init__.py"

Part of - not to be confused with - the project manager Hatch.

The source distribution section is included because by default hatchling ostensibly includes all files that are not ignored. However, it only abides by the root .gitignore. It will include virtual environments, if not named .venv. For a project that advocates sensible defaults, this is surprising behavior and a security flaw. Even if the issue is fixed, it will presumably include untracked files and clearly omissible directories such as .github.

setuptools

- 894 kB with 0 dependencies

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[tool.setuptools]
packages = ["<package>"]

[tool.setuptools.dynamic]
version = {attr = "<package>.__version__"}

[tool.setuptools.package-data]
<package> = ["py.typed"]

The original build tool, and previously the de facto standard. It is no longer commonly included in Python distributions, so they are all on equal footing with respect to needing installation.

Setuptools requires explicitly specifying the package, as well as any package data. It also includes legacy “.egg” and “setup.cfg” files, which a modern user will not be familiar with.

flit-core

- 63 kB with 0 dependencies

[build-system]
requires = ["flit-core>=3.4"]
build-backend = "flit_core.buildapi"

Part of the Flit tool for publishing packages.

Flit automatically supports dynamic versions (and descriptions), and includes the source directory with data files in the source distribution.

pdm-backend

- 101 kB with 0 dependencies

[build-system]
requires = ["pdm-backend"]
build-backend = "pdm.backend"

[tool.pdm]
version = {source = "file", path = "<package>/__init__.py"}

Part of - not to be confused with - the project manager PDM.

PDM automatically includes the source and test directories, with data files, in the source distribution.

Evaluations

Popularity and endorsements

The popularity of Setuptools should be discounted because of its history. The popularity of Hatchling and PDM-backend is clearly influenced by their respective parent projects. PDM has significantly fewer downloads than the others, but they are all popular enough to expect longevity.

Setuptools, Hatch, and Flit are all under the packaging authority umbrella, though as the previously cited article points out, PyPA affiliation does not indicate much.

The tutorial “defaults” to Hatchling, which presumably is not intended as an endorsement, but will no doubt be interpreted as such.

Size and dependencies

Setuptools is by far the largest; no surprise since it is much more than a build backend. Hatchling is the only one with dependencies, but the 3 modern ones seem appropriately lightweight.

File selection

Wheels have a standard layout, but source distributions do not. Whether sdist should include docs and tests is a matter of debate.

There was a time when open source software meant “distributed with an open source license”, so the source distribution was the primary way to acquire the code. This all seems anachronistic in the age of distributed version control and public collaboration. Not to mention that wheels are zip files which contain the source code.

One piece of advice is that the sdist should be buildable. Generated portable files could be included, thereby not needing the tools that generate them. But for a simple (read pure) Python project, that is not particularly relevant.

There is another issue with backends creating different artifacts when using their own build commands. This rundown only evaluated python -m build.
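
For reference, that standard invocation, which produces both an sdist and a wheel in the dist/ directory, is:

python -m pip install build
python -m build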

Metadata

The modern 3 implicitly support data files. All 4 support dynamic versioning in some manner. Then again, maybe the __version__ attribute is no longer the leading convention among the 7 options for single-sourcing the version. Now that importlib.metadata is no longer provisional, is that preferred?
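
If so, the lookup is short; here is a sketch where "<package>" stands in for your installed distribution name:

from importlib.metadata import version

__version__ = version("<package>")  # reads the version from the installed distribution's metadata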

Recommendations

It would be disingenuous to not end with recommendations, since the refusal to - in a document titled tool recommendations - is the problem. The PyPA endorses pip, build, and twine as standard tools, even though there are alternatives.

Author’s disclosures: I am a long-time Python developer of several packages, and a couple with extension modules. I use no project management tools, and am not affiliated with any of these projects.

  1. flit-core - No criticisms. The dynamic version and description feature are a plus; not having any flit-specific sections feels like less coupling.
  2. pdm-backend - No criticisms. A natural choice if one wants tests in the source distribution.
  3. hatchling - The file selection issue is significant. Users need a warning that they should include an sdist section and check their tarballs. Many are going to have unnecessarily large distributions, and someone with a local secrets directory - whether ignored or untracked - is going to have a seriously bad day.
  4. setuptools - Requires the most customization and is perpetually handicapped by backwards compatibility. The only advantage setuptools had was being already installed. It may be time to disavow it for new projects without extension modules.

My projects currently use setuptools for purely historical reasons. For new projects, I would likely use flit-core. I may switch over existing projects, though there is really no incentive to.

Unless a standard emerges, of course.

Epilogue

A meta case could be made for Flit(-core) as well: that its limited scope and independence from a project manager is itself an asset. Whereas choosing Hatch(ling) or PDM(-backend) feels like picking a side. Flit can position itself as the minimalist choice for those who resent having to choose.

Categories: FLOSS Project Planets

Mike Driscoll: Ruff – The Fastest Python Linter and Formatter Just Got Faster!

Sat, 2024-05-11 10:16

I’m a little late in reporting on this topic, but Ruff put out an update in April 2024 that includes a hand-written recursive descent parser. This update is in version 0.4.0 and newer.

Ruff’s new parser is >2x faster, translating to a 20-40% speedup for all linting and formatting invocations. Ruff’s announcement includes some statistics to show improvements that are worth checking out.
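
If you haven’t tried Ruff yet, these are the two invocations the speedup applies to:

ruff check .    # lint the current directory
ruff format .   # format the current directory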

What’s This New Parser?

I’ve never tried writing a code parser, so I’ll have to rely on Ruff’s announcement to explain this. Basically, when you are doing static analysis, you will turn the source code into Abstract Syntax Trees (ASTs), which you can then analyze. Python has an AST module built in for this purpose. Ruff is written in Rust, though, so their AST analyzer is also written in Rust.

The original parser was a generated parser, built with LALRPOP. A generated parser requires a grammar to be defined in a Domain Specific Language (DSL), which the generator then converts into executable code.

Ruff’s new hand-written parser is a recursive descent parser. Follow that link to Wikipedia to learn all the nitty gritty details.

Their team created a hand-written parser to give them more control and flexibility over the parsing process, making it easier to work on the many weird edge cases they need to support. They also created a new parser to make Ruff faster and provide better error messages and error resilience.

Wrapping Up

Ruff is great and makes linting and formatting your Python code so much faster. You can learn much more about Ruff in my other articles on this topic:

The post Ruff – The Fastest Python Linter and Formatter Just Got Faster! appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Go Deh: Predicting results from small samples.

Fri, 2024-05-10 16:09

 

I've run simulations, tens of thousands of them at a time, over and over as we developed chips. In one project I noticed that I could predict the final result after only a small number of results were in, which allowed me to halt the rest of the simulations, or make advance preparations for the final result.

I looked it up at the time and, indeed, there is an equation which, if you want to know the pass rate of a "large" population to within a given accuracy, will give you the minimum sample size to use.

To some, that's all gobbledygook, so I'll try to explain with some code.

Explanation

Let's say you have a large randomised population of pass/fails or ones and zeroes:

from random import sample

population_size = 100_000  # must be large, > 65K?
sample_size = 123          # Actual
p_hat = 0.5                # Population pass rate, 0.5 == 50%

_ones = int(population_size * p_hat)
_zeroes = population_size - _ones
population = [1] * _ones + [0] * _zeroes

And that we take a sample from it and compute the pass rate of that single, smaller sample

def random_sample() -> list[int]:
    return sample(population, k=sample_size)

pass_rate = (sum(random_sample())  # how many ones
             / sample_size)        # convert to a pass rate

print(pass_rate)  # e.g. 0.59027552674230146

Every time we run that pass_rate expression we get a different value. It is random. We need to run that pass_rate calculation many times to get an idea of what the likely pass_rate would be when using the given sample size:

runs_per_sample = 10_000  # each sample size is run this many times ~2500

pass_rates = [sum(random_sample()) / sample_size
              for _ in range(runs_per_sample)]

I have learnt that the question to ask is not how often the sample pass rate equals the population pass rate, but rather to define an acceptable margin of error (say 5%) and ask how often the sample pass rates fall within that margin.

epsilon = 0.05  # target margin of error in sample pass rate. 0.05 == +/-5%

p_hat_min, p_hat_max = p_hat * (1 - epsilon), p_hat * (1 + epsilon)
in_range_count = sum(p_hat_min <= pass_rate < p_hat_max
                     for pass_rate in pass_rates)
sample_confidence_level = in_range_count / runs_per_sample
print(f"{sample_confidence_level = }")  # = 0.4054

So for a sample size of 123, we could expect the pass rate of the sample to be within 5% of the actual pass rate of the population (0.5, or 50%) only 0.4, or 40%, of the time!

We need more!

What is actually done is we state what we think the population pass rate is, p_hat (choose closer to 50% if you are unsure); the margin of error around p_hat we want, epsilon, usually +/-5% or +/-3%; and the confidence_level we want of a sample's pass rate landing within that margin of error.

There are calculators that will then give you n, the size of the sample needed to satisfy those condition.
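
For reference, those calculators are typically based on the normal-approximation formula n = z^2 * p(1 - p) / e^2, where z is the z-score for the desired confidence level and e is the absolute margin of error. A quick sketch, noting that this post's epsilon is relative to p_hat, so e = p_hat * epsilon:

from math import ceil

z = 1.96             # z-score for a 95% confidence level
p_hat = 0.5          # assumed population pass rate
epsilon = 0.05       # relative margin of error, as used in this post
e = p_hat * epsilon  # absolute margin of error: 0.025

n = ceil(z**2 * p_hat * (1 - p_hat) / e**2)
print(n)  # 1537, close to the ~1535 that the simulation below converges on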

Doing it myself

I calculated for one specific sample size, above. Obviously, if I calculated pass_rates over a range of sample_sizes, with increasing runs_per_sample, I could search out the sample size needed.

That is done in my next program. I have to switch to using the numpy library for its speed, and sample_size becomes a range.

When the pass rate confidence levels are calculated, I end up with a list of confidence levels for increasing sample sizes that is usually not monotonically increasing, due to the randomness, e.g.

range_hits = [...0.94, 0.95, 0.93, 0.954, ... 0.95, 0.96, 0.95, 0.96, 0.96, 0.97, ...]  # confidence levels

The range of sample_size between the first occurrence >= the requested confidence level and the last occurrence of a confidence level < the requested confidence level is then slightly widened, and the runs_per_sample increased, on another iteration to get a better result.

Here's a sample of the output I get when searching:

Sample output

$ time python3 sample_search.py
2098 <= n <= 2610
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(50, 5000, 512), 250)

2013 <= n <= 2525
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(2013, 2694, 256), 500)

1501 <= n <= 2781
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1501, 3037, 256), 500)

1757 <= n <= 2013
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(221, 4061, 256), 500)

1714 <= n <= 1970
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1714, 2055, 128), 1000)

1458 <= n <= 2098
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1458, 2226, 128), 1000)

1586 <= n <= 1714
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(818, 2738, 128), 1000)

1564 <= n <= 1692
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1564, 1735, 64), 2000)

1500 <= n <= 1564
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1436, 1820, 64), 2000)

1553 <= n <= 1585
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1489, 1575, 32), 4000)

1547 <= n <= 1579
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1547, 1590, 16), 8000)

1547 <= n <= 1579
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1515, 1611, 16), 8000)

1541 <= n <= 1581
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1541, 1584, 8), 16000)

1501 <= n <= 1533
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1501, 1621, 8), 16000)

1501 <= n <= 1533
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1389, 1538, 8), 16000)

1503 <= n <= 1575
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1495, 1677, 8), 16000)

1503 <= n <= 1535
  For p_hat, epsilon, confidence_level =(0.5, 0.05, 0.95)
  Using population_size, sample_size, runs_per_sample =(100000, range(1491, 1587, 4), 32000)

 Use a sample n = 1535 to predict population pass rate of 50.0% +/-5% with a confidence level of 95%.

real    3m49.023s
user    3m49.022s
sys     0m1.361s


My Code

# -*- coding: utf-8 -*-
"""
Created on Wed May  8 14:04:17 2024

@author: paddy
"""
# %%
from random import sample

import numpy as np


def sample_search(population_size, sample_size, p_hat, epsilon, confidence_level, runs_per_sample) -> range:
    """
    Arguments with example values:
        population_size = 100_000           # must be large, > 65K?
        sample_size = range(1400, 1750, 16) # (+min, +max, +step)
        p_hat = 0.5                         # Population pass rate, 0.5 == 50%
        epsilon = 0.05                      # target margin of error in sample pass rate. 0.05 == +/-5%
        confidence_level = 0.95             # sample to be within p_hat +/- epsilon, 0.95 == 95% of the time.
        runs_per_sample = 10_000            # each sample size is run this many times ~2500

    Return:
        min,max range for the sample size, n, satisfying inputs.
    """

    def create_1_0_array(population_size=100_000, p_hat=0.5) -> np.ndarray:
        "Create numpy array of ones and zeroes with p_hat% as ones"
        ones = int(population_size * p_hat + 0.5)
        array10 = np.zeros(population_size, dtype=np.uint8)
        array10[:ones] = 1

        return array10

    def rates_of_samples(population: np.ndarray, sample_size_range: range,
                         runs_per_sample: int) -> list[list[float]]:
        "Pass rates for range of sample sizes repeated runs_per_sample times."
        # many_samples_many_rates = [(np.random.shuffle(population),         # shuffle every *run*
        #                             [population[:s_count].sum() / s_count  # The pass rate for samples
        #                              for s_count in sample_size_range]
        #                             )[1]                                   # drop the shuffle
        #                            for _ in range(runs_per_sample)]        # Every run

        many_samples_many_rates = [[np.random.shuffle(population),          # shuffle every *sample*
                                    [population[:s_count].sum() / s_count   # The pass rate for samples
                                     for s_count in sample_size_range]
                                    ][1]                                     # drop the shuffle
                                   for _ in range(runs_per_sample)]          # Every run

        return list(zip(*many_samples_many_rates))  # Transpose to by_sample_size_then_runs

    population = create_1_0_array(population_size, p_hat)
    by_sample_size_then_runs = rates_of_samples(population, sample_size, runs_per_sample)

    # Pass rates within target
    target_pass_range = tmin, tmax = p_hat * (1 - epsilon), p_hat * (1 + epsilon)  # Looking for rates within the range

    range_hits = [sum(tmin <= sample_pass_rate < tmax for sample_pass_rate in single_sample_size)
                  for single_sample_size in by_sample_size_then_runs]
    runs_for_confidence_level = confidence_level * runs_per_sample

    for n_min, conf in zip(sample_size, range_hits):
        if conf >= runs_for_confidence_level:
            break
    else:
        n_min = sample_size.start

    for n_max, conf in list(zip(sample_size, range_hits))[::-1]:
        if conf <= runs_for_confidence_level:
            n_max += sample_size.step  # back a step
            break
    else:
        n_max = sample_size.stop

    if (n_min + sample_size.step) >= n_max and sample_size.step > 1:
        # Widen
        n_max = n_max + sample_size.step + 1

    return range(n_min, n_max, sample_size.step)


def min_max_mid_step(from_range: range) -> tuple[int, int, float, int]:
    "Extract from **increasing** from_range the min, max, middle, step"
    mn, st = from_range.start, from_range.step

    # Handle range where start == stop
    mx = from_range.stop
    for mx in from_range:
        pass

    md = (mn + mx) / 2

    return mn, mx, md, st


def next_sample_size(new_samples, last_samples,
                     runs_per_sample,
                     widener=1.33  # Widen range by
                     ):
    n_min, n_max, n_mid, n_step = min_max_mid_step(new_samples)
    l_min, l_max, l_mid, l_step = min_max_mid_step(last_samples)

    # Next range of samples computed in names with prefix s_

    increase_runs = True
    if n_max == l_max:
        # Major expand of high end
        s_max = l_max + (l_max - l_min)
        increase_runs = False
    else:
        s_max = (n_mid + (n_max - n_mid) * widener)

    if n_min == l_min:
        # Major expand of low end
        s_min = max(1, l_min + (l_min - l_max))
        increase_runs = False
    else:
        s_min = (n_mid + (n_min - n_mid) * widener)

    s_min, s_max = (max(1, int(s_min)), int(s_max + 0.5))
    s_step = n_step
    if s_min == s_max:
        if s_min > 2:
            s_min -= 1
        s_max += 1

    if increase_runs or n_max == n_min:
        runs_per_sample *= 2
        if n_max == n_min:
            s_step = 1
        else:
            s_step = max(1, (s_step + 1) // 2)  # Go finer

    next_sample_range = range(max(1, int(s_min)), int(s_max + 0.5), s_step)

    return next_sample_range, runs_per_sample

# %%
if __name__ == '__main__':

    population_size = 100_000           # must be large, > 65K?
    sample_size = range(50, 5_000, 512) # Increasing!
    p_hat = 0.50                        # Population pass rate, 0.5 == 50%
    epsilon = 0.05                      # target margin of error in sample pass rate. 0.05 == +/-5%
    confidence_level = 0.95             # sample to be within p_hat +/- epsilon, 0.95 == 95% of the time.
    runs_per_sample = 250               # each sample size is run this many times at start, ~250
    max_runs_per_sample = 35_000

    while runs_per_sample < max_runs_per_sample:
        new_range = sample_search(population_size, sample_size, p_hat, epsilon, confidence_level, runs_per_sample)
        n_min, n_max, n_mid, n_step = min_max_mid_step(new_range)
        print(f"{n_min} <= n <= {n_max}")
        print(f"  For {p_hat, epsilon, confidence_level =}\n"
              f"  Using {population_size, sample_size, runs_per_sample =}\n")

        sample_size, runs_per_sample = next_sample_size(new_range, sample_size, runs_per_sample)

    print(f" Use a sample n = {n_max} to predict population pass rate of {p_hat*100.:.1f}% +/-{epsilon*100.:.0f}% "
          f"with a confidence level of {confidence_level*100.:.0f}%.")

END.


 

Categories: FLOSS Project Planets

Mike Driscoll: One Week Left for Python Logging Book / Course Kickstarter

Fri, 2024-05-10 08:55

My latest Python book campaign is ending in less than a week. This book is about Python’s logging module. I also include two chapters that discuss structlog and loguru.

Support on Kickstarter 

Why Back A Kickstarter?

The reason to back the Kickstarter is that I have exclusive perks there that you cannot get outside of it. Here are some examples:

  • Signed paperback copy of the book
  • Early access to the video course lessons
  • T-shirt with the cover art
  • Exclusive price for Teach Me Python, which includes ALL my self-published books and courses
  • Exclusive price for all my self-published books

Support on Kickstarter

What You’ll Learn

In this book, you will learn about the following:

  • Logger objects
  • Log levels
  • Log handlers
  • Formatting your logs
  • Log configuration
  • Logging decorators
  • Rotating logs
  • Logging and concurrency
  • and more!
Book formats

The finished book will be made available in the following formats:

  • paperback (at the appropriate reward level)
  • PDF
  • epub

The paperback is a 6″ x 9″ book and is approximately 150 pages long.

Support on Kickstarter 

The post One Week Left for Python Logging Book / Course Kickstarter appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Real Python: The Real Python Podcast – Episode #204: Querying OpenStreetMaps via API & Lazy Evaluation in Python

Fri, 2024-05-10 08:00

Would you like to get more practice working with APIs in Python? How about exploring the globe using the data from OpenStreetMap? Christopher Trudeau is back on the show this week, bringing another batch of PyCoder's Weekly articles and projects.


Categories: FLOSS Project Planets

Django Weblog: Django Developers Survey 2023 results

Fri, 2024-05-10 02:22

In October-November 2023, the Django Software Foundation, in partnership with PyCharm, carried out a survey to capture the preferences and contributions of Django developers worldwide. Today, we’re excited to share the results through detailed infographics highlighting how our community influences the future of web development.

View the Django Developers Survey 2023 report

Why should you check out the infographics?

  1. Discover the latest trends in Django development.
  2. Learn about the tools and technologies preferred by leading developers.
  3. Understand the challenges and opportunities within the Django ecosystem.

Visit the landing page to explore the full report and gain insights that can help shape your projects and strategies in the Django landscape.

Categories: FLOSS Project Planets

Seth Michael Larson: Bringing supply chain security to PyCon US 2024

Thu, 2024-05-09 20:00

Published 2024-05-10 by Seth Larson

This critical role would not be possible without funding from the Alpha-Omega project. Massive thank-you to Alpha-Omega for investing in the security of the Python ecosystem!

Next week is PyCon US 2024, one of my favorite times of year. If you'll also be in Pittsburgh, reach out to me on Signal (sethmlarson.99) and we'll meet up sometime during the conference.

Here's where you'll find me during the week:


Secure snek 🐍🛡️
  • Talk on "State of Supply Chain Security" with Michael Winser of Alpha-Omega.
  • Open space on Vulnerability Disclosure and Management with Madison Oliver of GitHub Security.
  • Blogger for the Python Language Summit.
  • Spreading security knowledge along with Mike Fiedler, the PyPI Safety and Security Engineer.
  • Working in-person with Python core developers during sprints.

I'll also be bringing along some exclusive "secure snek" stickers, so if you see me at the conference ask about those while supplies last!

If you're interested in security, I recommend considering these other tutorials and talks:

That's all for this week! 👋 If you're interested in more you can read last week's report.

Thanks for reading! ♡ Did you find this article helpful and want more content like it? Get notified of new posts by subscribing to the RSS feed or the email newsletter.

This work is licensed under CC BY-SA 4.0

Categories: FLOSS Project Planets

Python People: Shauna Gordon-McKeon - Open Source Governance, Women's Soccer, and Django

Thu, 2024-05-09 10:41

This is a really fun talk with Shauna.  

We talk about:

  • Going from academia to tech
  • Django
  • Open source project governance and Governing Open
  • Women's Soccer and the NWSL

Shauna's technical consulting business is Galaxy Rise Consulting

The Complete pytest Course

  • Level up your testing skills and save time during coding and maintenance.
  • Check out courses.pythontest.com

★ Support this podcast on Patreon ★
Categories: FLOSS Project Planets

Mike Driscoll: Episode 40 – Open Source Development with Antonio Cuni

Thu, 2024-05-09 10:16

In this episode, we discuss working on several different open-source Python packages. Antonio Cuni is our guest, and he chats about his work on PyScript, pdb++, pypy, HPy, and SPy.

Listen in as we chat about Python, packages, open source, and so much more!

Show Links

Here are some of the projects we talked about in the show:

  • The Invent Framework
  • PyScript
  • pdb++ – A drop-in replacement for pdb
  • pypy – The fast, compliant, alternative Python implementation
  • HPy – A better C API for Python
  • SPy – Static Python

The post Episode 40 – Open Source Development with Antonio Cuni appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Robin Wilson: New Projects page on my website

Thu, 2024-05-09 05:30

Just a quick post here to say that I’ve added a new Projects page to my freelance website. I realised I didn’t have anywhere online that I could point people to that had links to all of the ‘non-work’ (maybe that should be ‘non-paid’) projects I’ve made.

These projects include my Free GIS Data site, the British Placename Mapper, Py6S and more. I’ve also put together a separate page (linked from the projects page) with all my university theses (PhD, MSc and undergraduate) and other university work – which still get a remarkably high number of downloads.

Have a look here, or see a screenshot of the first few entries below:

Categories: FLOSS Project Planets

Talk Python to Me: #461: Python in Neuroscience and Academic Labs

Thu, 2024-05-09 04:00
Do you use Python in an academic setting? Maybe you run a research lab or teach courses using Python. Maybe you're even a student using Python. Whichever it is, you'll find a ton of great advice in this episode. I talk with Keiland Cooper about how he is using Python at his neuroscience lab at the University of California, Irvine.

Episode sponsors

  • Neo4j: talkpython.fm/neo4j-notes
  • Posit: talkpython.fm/posit
  • Talk Python Courses: talkpython.fm/training

Links from the show

  • Keiland's website: kwcooper.xyz
  • Keiland on Twitter: @kw_cooper
  • Keiland on Mastodon: @kwcooper@fediscience.org
  • Journal of Open Source Software: joss.readthedocs.io
  • Avalanche project: avalanche.continualai.org
  • ContinualAI: continualai.org
  • Executable Books Project: executablebooks.org
  • eLife Journal: elifesciences.org
  • Watch this episode on YouTube: youtube.com
  • Episode transcripts: talkpython.fm
Categories: FLOSS Project Planets

Trey Hunner: My favorite Python 3.13 feature

Wed, 2024-05-08 16:30

Python 3.13 just hit feature freeze with the first beta release today.

Just before the feature freeze, a shiny new feature was added: a brand new Python REPL. ✨

This new Python REPL will likely be my favorite thing about 3.13. It’s definitely the feature I’m most looking forward to using while teaching after 3.13.0 final is released later this year.

I’d like to share what’s so great about this new REPL and what additional improvements I’m hoping we might see in future Python releases.

Little niceties

The first thing you’ll notice when you launch the new REPL is the colored prompt.

You may also notice that as you type a block of code, after the first indented line, the next line will be auto-indented! Additionally, hitting the Tab key inserts 4 spaces now, which means there’s no more need to ever hit Space Space Space Space to indent ever again.

At this point you might be thinking, “wait did I accidentally launch ptpython or some other alternate REPL?” But it gets even better!

You can “exit” now

Have you ever typed exit at the Python REPL? If so, you’ve seen a message like this:

>>> exit
Use exit() or Ctrl-D (i.e. EOF) to exit

That feels a bit silly, doesn’t it? Well, typing exit will exit immediately.

Typing help also enters help mode now (previously you needed to call help() as a function).

Block-level history

The feature that will make the biggest different in my own usage of the Python REPL is block-level history.

I make typos all the time while teaching. I also often want to re-run a specific block of code with a couple small changes.

The old-style Python REPL stores history line-by-line. So editing a block of code in the old REPL required hitting the up arrow many times, hitting Enter, hitting the up arrow many more times, hitting Enter, etc. until each line in a block was chosen. At the same time you also needed to make sure to edit your changes along the way… or you’ll end up re-running the same block with the same typo as before!

The ability to edit a previously typed block of code is huge for me. For certain sections of my Python curriculum, I hop into ptpython or IPython specifically for this feature. Now I’ll be able to use the default Python REPL instead.

Pasting code just works

The next big feature for me is the ability to paste code.

Check this out:

Not impressed? Well, watch what happens when we paste that same block of code into the old Python REPL:

The old REPL treated pasted text the same as manually typed text. When two consecutive newlines were encountered in the old REPL, it would end the current block of code because it assumed the Enter key had been pressed twice.

The new REPL supports bracketed paste, which was invented in 2002 and has since been adopted by all modern terminal emulators.

No Windows support? Curses!

Unfortunately, this new REPL doesn’t currently work on Windows. This new REPL relies on the curses and readline modules, neither of which are available on Windows. I’m hoping that this new REPL might encourage the addition of curses support on Windows (there are multiple issues discussing this).

The in-browser Python REPL on Python Morsels also won’t be able to use the new REPL because readline and curses aren’t available in the WebAssembly Python build.

Beta test Python 3.13 to try out the new REPL 💖

Huge thanks to Pablo Galindo Salgado, Łukasz Langa, and Lysandros Nikolaou for implementing this new feature! And thanks to Michael Hudson-Doyle and Armin Rigo for implementing the original version of this REPL, which was heavily borrowed from PyPy’s pyrepl project.

The new Python REPL coming in 3.13 is a major improvement over the old REPL. The lack of Windows support is disappointing, but I’m hopeful that a motivated Windows user will help add support eventually!

Want to try out this new REPL? Download and install Python 3.13.0 beta 1!

Beta testing new Python releases helps the Python core team ensure the final release of 3.13.0 is as stable and functional as possible. If you notice a bug, check the issue tracker to see if it’s been reported yet and if not report it!

Categories: FLOSS Project Planets

Python Insider: Python 3.13.0 beta 1 released

Wed, 2024-05-08 14:11

I'm pleased to announce the release of Python 3.13 beta 1 (and feature freeze for Python 3.13).

https://www.python.org/downloads/release/python-3130b1/

 

This is a beta preview of Python 3.13

Python 3.13 is still in development. This release, 3.13.0b1, is the first of four beta release previews of 3.13.

Beta release previews are intended to give the wider community the opportunity to test new features and bug fixes and to prepare their projects to support the new feature release.

We strongly encourage maintainers of third-party Python projects to test with 3.13 during the beta phase and report issues found to the Python bug tracker as soon as possible. While the release is planned to be feature complete entering the beta phase, it is possible that features may be modified or, in rare cases, deleted up until the start of the release candidate phase (Tuesday 2024-07-30). Our goal is to have no ABI changes after beta 4 and as few code changes as possible after 3.13.0rc1, the first release candidate. To achieve that, it will be extremely important to get as much exposure for 3.13 as possible during the beta phase.

Please keep in mind that this is a preview release and its use is not recommended for production environments.

Major new features of the 3.13 series, compared to 3.12

Some of the new major new features and changes in Python 3.13 are:

New features

Typing

Removals and new deprecations
  • PEP 594 (Removing dead batteries from the standard library) scheduled removals of many deprecated modules: aifc, audioop, chunk, cgi, cgitb, crypt, imghdr, mailcap, msilib, nis, nntplib, ossaudiodev, pipes, sndhdr, spwd, sunau, telnetlib, uu, xdrlib, lib2to3.
  • Many other removals of deprecated classes, functions and methods in various standard library modules.
  • C API removals and deprecations. (Some removals present in alpha 1 were reverted in alpha 2, as the removals were deemed too disruptive at this time.)
  • New deprecations, most of which are scheduled for removal from Python 3.15 or 3.16.

(Hey, fellow core developer, if a feature you find important is missing from this list, let Thomas know.)

For more details on the changes to Python 3.13, see What’s new in Python 3.13. The next pre-release of Python 3.13 will be 3.13.0b2, currently scheduled for 2024-05-28.

More resources

Enjoy the new releases

Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organization contributions to the Python Software Foundation.

Your release team,
Thomas Wouters
Łukasz Langa
Ned Deily
Steve Dower 

 

Categories: FLOSS Project Planets

Daniel Roy Greenfeld: TIL: Running UV outside a virtualenv

Wed, 2024-05-08 11:22

Breaking the rules to satisfy continuous integration.

A few months ago I blogged about forcing pip to require a virtualenv. However, when automating tests and deployments, you sometimes work outside of virtualenvs. With pip this isn't a problem; you just don't set what I did in that article. But what if you are using the Rust-based uv, where the default is to keep you in a virtualenv?

The answer is when you install dependencies using uv in this scenario, use the --python flag to specify the interpreter. According to the uv docs, this flag is intended for use in continuous integration (CI) environments or other automated workflows.

So without further ado, this is what I did:

python -m pip install uv
uv pip install -p 3.12 -r requirements.txt

As a bonus, here's the command inside GitHub actions-flavored YAML:

- name: Install Dependencies
  run: |
    python -m pip install uv
    uv pip install -p 3.12 -r requirements.txt

Want to know how to handle multiple versions of Python? Here's how use a matrix on GitHub: https://github.com/pydanny/dj-notebook/blob/main/.github/workflows/python-ci.yml#L18-L19

Categories: FLOSS Project Planets

The Python Show: 40 - Open Source Development with Antonio Cuni

Wed, 2024-05-08 10:28

In this episode, we discuss working on several different open-source Python packages. Antonio Cuni is our guest, and he chats about his work on PyScript, pdb++, pypy, HPy, and SPy.

Listen in as we chat about Python, packages, open source, and so much more!

Show Links

Here are some of the projects we talked about in the show:

Categories: FLOSS Project Planets
