Planet Python


CodersLegacy: Cython vs CPython – Comparing the Speed Difference

Sun, 2023-06-11 12:36

In this Cython vs CPython article, we will conduct a speed comparison using 11 different benchmarks, covering diverse scenarios and edge cases.

Python is a popular programming language known for its simplicity and readability. However, it is an interpreted language, which can sometimes result in slower execution speeds compared to compiled languages like C. To address this limitation, developers have introduced Cython, an optimizing static compiler for Python. Cython allows you to write Python-like code that can be compiled to C, offering potential performance gains.

Table Of Contents
  1. Testing Environment
  2. Benchmark 1: Fibonacci Sequence
  3. Benchmark 2: Fibonacci Sequence (Iterative)
  4. Benchmark 3: Matrix Multiplication
  5. Benchmark 4: Prime Number Generation
  6. Benchmark 5: String Concatenation
  7. Benchmark 6: Computing Column Means
  8. Benchmark 7: Computing Column Means (Unoptimized)
  9. Benchmark 8: Arithmetic Operations
  10. Benchmark 9: File Operations
  11. Benchmark 10: Linear Searching
  12. Benchmark 11: Bubble Sorting
  13. Conclusion
Testing Environment

Before examining any benchmarks, you should be aware of what environment and what methods were used to conduct the testing. This helps in reproducibility, and in gaining a better understanding of the results (as results will vary from platform to platform).

Python version: 3.10

Hardware: Ryzen 7 5700U + 16GB RAM + SSD

Operating System: Windows 11

Benchmark Library: Timeit

Technique: We used the Python timeit library to apply several techniques to reduce randomness and improve our timing accuracy. The timeit library allows us to repeat the benchmarks “x” number of times, which we can then average out to reduce randomness (controlled by the repeat parameter). We can also consecutively perform the benchmarks “x” number of times and add all their times together to further reduce randomness (controlled by the number parameter).
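As a rough illustration of how the two parameters interact, the sketch below times a stand-in workload (`sum(range(1000))`, chosen here only for illustration and not one of the benchmarks). Each value returned by `timeit.repeat` is the total time for `number` executions, so dividing by `number` gives a per-call figure.

```python
import timeit
from statistics import mean

NUMBER = 10   # executions per timing run (their times are added together)
REPEAT = 10   # independent timing runs

# Stand-in workload, not one of the article's benchmarks.
runs = timeit.repeat("sum(range(1000))", number=NUMBER, repeat=REPEAT)

avg_total = mean(runs)               # average total time for NUMBER executions
per_call = avg_total / NUMBER        # average time for a single execution
best_per_call = min(runs) / NUMBER   # per-call time of the least-disturbed run

print(f"{len(runs)} runs, {per_call:.10f} s per call on average")
print(f"best run: {best_per_call:.10f} s per call")
```

The timeit documentation suggests that the minimum of the repeated runs is often the more informative figure, since higher values tend to reflect interference from other processes rather than the code itself.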

Here is a code snippet of our testing code from one of our benchmarks.

    import timeit
    import program1_cy
    from statistics import mean

    python = mean(timeit.repeat('program1.fibonacci(25)', setup='import program1', number=10, repeat=10))
    cython = mean(timeit.repeat('program1_cy.fibonacci(25)', setup='import program1_cy', number=10, repeat=10))

    print(f"Python Time: {python:.10f}")
    print(f"Cython Time: {cython:.10f}")

    if python/cython > 1:
        print(f"Cython was {python/cython:.3f} times faster")
    else:
        # Invert the ratio so the "slower" factor is also greater than 1
        print(f"Cython was {cython/python:.3f} times slower")

Benchmark 1: Fibonacci Sequence

Python Code:

    def fibonacci(n):
        if n <= 1:
            return n
        else:
            return fibonacci(n-1) + fibonacci(n-2)

Cython Code:

    def fibonacci(int n):
        if n <= 1:
            return n
        else:
            return fibonacci(n-1) + fibonacci(n-2)

Benchmark#1 Result

Both the Python and Cython versions of the Fibonacci sequence use recursive calls to calculate the value. However, the Cython code, with the explicit declaration of the integer type, allows for more efficient execution and avoids the interpreter overhead of Python, resulting in faster execution.

10th Fibonacci Number
Python Time: 0.0001332700
Cython Time: 0.0000027100
Cython was 49.177 times faster

25th Fibonacci Number
Python Time: 0.3970897400
Cython Time: 0.0028510200
Cython was 139.280 times faster

Cython is also just a lot better at handling recursion than Python, which is why we see such a big performance boost. Applying it to the non-recursive version does not yield such a large difference for the same input, as we will see in the next benchmark.

Overall: Cython Wins

Benchmark 2: Fibonacci Sequence (Iterative)

Python Code:

    def fib(n):
        n1, n2 = 0, 1
        for i in range(1, n + 1):
            temp = n1 + n2
            n1 = n2
            n2 = temp
        return n2

Cython Code:

    cpdef int fib(int n):
        cdef int n1 = 0
        cdef int n2 = 1
        cdef int temp, i
        for i in range(1, n + 1):
            temp = n1 + n2
            n1 = n2
            n2 = temp
        return n2

Benchmark#2 Result:

Here we can observe some massive speedups as well, although for the same input they are not as large as in the recursive version (compare the two 10th Fibonacci number results). We can’t go above the 25th Fibonacci number in the recursive version, because it is incredibly slow (especially in a language like Python).

10th Fibonacci Number
Python Time: 0.0000183300
Cython Time: 0.0000019300
Cython was 9.497 times faster

100th Fibonacci Number
Python Time: 0.0060053901
Cython Time: 0.0000407100
Cython was 147.516 times faster

10000th Fibonacci Number
Python Time: 1.5474137500
Cython Time: 0.0002609300
Cython was 5930.379 times faster

Overall: Cython Wins
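One caveat is worth sketching here. `cdef int` maps to a C `int`, which is 32 bits wide on most platforms (an assumption about the test machine), whereas Python integers grow without limit. The sketch below emulates 32-bit wraparound in plain Python with the hypothetical helpers `wrap_int32` and `fib_int32`; signed overflow is formally undefined in C, so wraparound is only the typical outcome. It suggests that a C `int` version stops returning correct values well before the 100th term, so the largest speedups may partly reflect that the two versions are no longer doing the same work.

```python
def fib(n):
    """Equivalent to the article's iterative Python version (unbounded ints)."""
    n1, n2 = 0, 1
    for _ in range(1, n + 1):
        n1, n2 = n2, n1 + n2
    return n2


def wrap_int32(x):
    """Emulate signed 32-bit two's-complement wraparound (an assumption)."""
    return (x + 2**31) % 2**32 - 2**31


def fib_int32(n):
    """Same loop, but every addition wraps like a 32-bit C int."""
    n1, n2 = 0, 1
    for _ in range(1, n + 1):
        n1, n2 = n2, wrap_int32(n1 + n2)
    return n2


# First input for which the 32-bit emulation diverges from the exact result.
first_bad = next(n for n in range(1, 200) if fib(n) != fib_int32(n))
print(f"first mismatch at n = {first_bad}")
print(f"fib(100) matches: {fib(100) == fib_int32(100)}")
```

If exact large values matter, one option is to leave the accumulator variables untyped so they stay Python integers, at the cost of some of the speedup.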

Benchmark 3: Matrix Multiplication

Python Code:

    import numpy as np

    def matrix_multiply(a, b):
        return np.dot(a, b)

Cython Code:

    import numpy as np
    cimport numpy as np

    cpdef np.ndarray[np.float64_t, ndim=2] matrix_multiply(np.ndarray[np.float64_t, ndim=2] a, np.ndarray[np.float64_t, ndim=2] b):
        return np.dot(a, b)

Benchmark#3 Result:

In this benchmark, both the Python and Cython codes use NumPy’s dot product function to perform matrix multiplication. For the smallest matrix, Cython actually performs slightly worse than the plain Python code.

5×5 Matrix
Python Time: 0.0000171100
Cython Time: 0.0000182500
Cython was 0.938 times faster

100×100 Matrix
Python Time: 0.0027420900
Cython Time: 0.0025540200
Cython was 1.074 times faster

500×500 Matrix
Python Time: 0.0296036600
Cython Time: 0.0290222200
Cython was 1.020 times faster

The explanation for this result is that numpy is already a highly optimized library written in C/C++. Thus, it has similar performance to Cython, but without the overhead Cython adds, which lets the plain Python version take the lead on small inputs.

As the size of the matrix increases, we can see Cython overtake the plain Python version a bit. This is because the overhead of using Cython becomes smaller in comparison to the time taken for the operation.

We can learn from this, that Cython will not help too much when using already optimized libraries such as numpy. If we were to remove numpy arrays here, and use normal python lists, perhaps the results would change significantly.

Overall: Draw
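To make the suggestion about plain Python lists concrete, here is a minimal sketch of what a list-based matrix multiplication might look like. The function name `matrix_multiply_lists` is hypothetical and this variant was not part of the benchmarks above; it only illustrates the kind of loop-heavy code where typed Cython variables would be expected to matter more than they do around a single NumPy call.

```python
def matrix_multiply_lists(a, b):
    """Multiply two matrices given as lists of rows, using plain loops."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    result = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(inner):
            a_ik = a[i][k]  # hoisted out of the innermost loop
            for j in range(cols):
                result[i][j] += a_ik * b[k][j]
    return result


product = matrix_multiply_lists([[1.0, 2.0], [3.0, 4.0]],
                                [[5.0, 6.0], [7.0, 8.0]])
print(product)
```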

Benchmark 4: Prime Number Generation

Python Code:

    def generate_primes(n):
        primes = []
        for num in range(2, n):
            if all(num % i != 0 for i in range(2, int(num**0.5) + 1)):
                primes.append(num)
        return primes

Cython Code:

    cpdef generate_primes(int n):
        cdef list primes = []
        cdef int num
        cdef int i
        for num in range(2, n):
            for i in range(2, int(num**0.5) + 1):
                if num % i == 0:
                    break
            else:
                primes.append(num)
        return primes

Benchmark#4 Result:

The Cython version of the prime number generation code benefits from the use of static typing. By declaring the variable types explicitly, the Cython code eliminates the dynamic type checking overhead of Python.

All primes till 100
Python Time: 0.0017563000
Cython Time: 0.0001182100
Cython was 14.857 times faster

All primes till 10000
Python Time: 0.3009807000
Cython Time: 0.0220213001
Cython was 13.668 times faster

This leads to significant speed improvements, especially when dealing with large prime numbers. The larger the number being tested, the more iterations are needed. Loops and iterations benefit greatly from static typing, as can be seen here.

Interestingly, however, the speedup does not grow as we increase the range of prime numbers.

Overall: Cython Wins
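Note that the two listings in this benchmark are not line-for-line translations: the Python version tests divisibility with `all()` over a generator, while the Cython version uses a `for`/`else` loop, whose `else` branch runs only when the loop finishes without hitting `break`. The sketch below, in plain Python with hypothetical function names, checks that the two formulations produce the same primes.

```python
def primes_with_all(n):
    """Same divisibility test as the Python listing, written as a comprehension."""
    return [num for num in range(2, n)
            if all(num % i != 0 for i in range(2, int(num**0.5) + 1))]


def primes_with_for_else(n):
    """Loop structure of the Cython listing, minus the type declarations."""
    primes = []
    for num in range(2, n):
        for i in range(2, int(num**0.5) + 1):
            if num % i == 0:
                break               # found a divisor, so num is composite
        else:
            primes.append(num)      # loop ended without break: num is prime
    return primes


print(primes_with_for_else(30))
print(primes_with_all(1000) == primes_with_for_else(1000))
```

Because the loop structure differs as well as the typing, part of the measured gap may come from avoiding the generator rather than from the type declarations alone.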

Benchmark 5: String Concatenation

Python Code:

    def concatenate_strings(a, b):
        return a + b

Cython Code:

    cpdef str concatenate_strings(str a, str b):
        return a + b

Benchmark#5 Result:

The string concatenation benchmark demonstrates a small difference between Python and Cython. Since both Python and Cython handle string operations in a similar manner, the performance gains are not very significant. In fact, the speedup seems to shrink as the length of the strings increases.

100 length strings
Python Time: 0.0000041500
Cython Time: 0.0000029400
Cython was 1.412 times faster

1000 length strings
Python Time: 0.0000084700
Cython Time: 0.0000071500
Cython was 1.185 times faster

Overall: Cython Wins

Benchmark 6: Computing Column Means

Python Code:

    import numpy as np

    def compute_column_means(data):
        num_cols = len(data[0])
        # axis=0 averages down the rows, giving one mean per column
        means = np.average(data, axis=0)
        return means

Cython Code:

    import numpy as np
    cimport numpy as np

    def compute_column_means(np.ndarray[np.float64_t, ndim=2] data):
        cdef int num_cols = data.shape[1]
        cdef np.ndarray[np.float64_t] means = np.zeros(num_cols)
        # axis=0 averages down the rows, giving one mean per column
        means = np.average(data, axis=0)
        return means

Benchmark#6 Result:

The above code was deliberately designed to be as optimized as possible using numpy and some of its highly optimized functions (written in C/C++).

rows = 100, cols = 10
Python Time: 0.0002119100
Cython Time: 0.0002120200
Cython was 0.999 times faster

rows = 1000, cols = 100
Python Time: 0.0008491300
Cython Time: 0.0008754800
Cython was 0.970 times faster

Looking at the results, we can observe that Cython is marginally slower (overhead). This is the second benchmark where we have observed that highly optimized code nullifies any benefit from Cython.

Overall: Draw

Benchmark 7: Computing Column Means (Unoptimized)

Python Code:

    def compute_column_means(data):
        num_cols = len(data[0])
        means = [0.0] * num_cols
        for col in range(num_cols):
            total = 0.0
            for row in data:
                total += row[col]
            means[col] = total / len(data)
        return means

Cython Code:

    cpdef list compute_column_means(data):
        cdef int num_cols = len(data[0])
        cdef list means = [0.0] * num_cols
        cdef double total
        cdef list row
        cdef int col
        for col in range(num_cols):
            total = 0.0
            for row in data:
                total += row[col]
            means[col] = total / len(data)
        return means

Benchmark#7 Result:

What we have done here, is created an unoptimized version of Benchmark#6 without the use of numpy. Now we will observe that Cython pulls ahead of regular Python by a significant margin.

row = 100, col = 10
Python Time: 0.0006929700
Cython Time: 0.0005150800
Cython was 1.345 times faster

row = 1000, col = 10
Python Time: 0.0864737500
Cython Time: 0.0592738300
Cython was 1.459 times faster

Overall: Cython Wins

Benchmark 8: Arithmetic Operations

Python Code:

    import math

    def compute_math():
        result = 0
        for i in range(10_000_000):
            result += math.sin(i) + math.cos(i)
        return result

Cython Code:

    import math

    def compute_math():
        cdef double result = 0
        for i in range(10_000_000):
            result += math.sin(i) + math.cos(i)
        return result

Benchmark#8 Result:

Here we have compiled a few common arithmetic and geometric operations, without the presence of loops. Some operations like division are actually more computationally expensive than you think. The goal of this benchmark was to check whether the use of Cython can speed these up (which it clearly can).

Mathematical Computations
Python Time: 0.0000170500
Cython Time: 0.0000124900
Cython was 1.365 times faster

Overall: Cython Wins

Benchmark 9: File Operations

Python Code:

    def read_file(filename):
        with open(filename, 'r') as f:
            contents = f.read()
        return contents

Cython Code:

    def read_file(filename):
        cdef str contents
        with open(filename, 'r') as f:
            contents = f.read()
        return contents

Benchmark#9 Result:

In the file operations benchmark, both Python and Cython exhibit similar performance since the file reading operation itself relies on low-level system calls. Therefore, the performance difference between the two is negligible. The overhead actually causes Cython to be a bit slower than native Python.

500 words text file
Python Time: 0.0016529600
Cython Time: 0.0018463700
Cython was 0.895 times faster

5000 words text file
Python Time: 0.0042956000
Cython Time: 0.0040852899
Cython was 1.051 times faster

Strangely enough, at 5000+ words the speed gap between Cython and Python decreased quite a bit, and Cython even outperformed Python sometimes (after running the tests multiple times). This might be because of the overhead becoming negligible.

Overall: Draw

Benchmark 10: Linear Searching

Python Code:

    def linear_search(lst, target):
        for i, element in enumerate(lst):
            if element == target:
                return i
        return -1

Cython Code:

    cpdef int linear_search(list lst, int target):
        cdef int n = len(lst)
        for i in range(n):
            if lst[i] == target:
                return i
        return -1

Benchmark#10 Result:

Here we have a program for linear searching. This involves no arithmetic operations, just a loop and a single comparison operation per element. This is the kind of code which really benefits from the use of Cython.

1000 numbers
Python Time: 0.0012754300
Cython Time: 0.0000313500
Cython was 40.684 times faster

10,000 numbers
Python Time: 0.0041481200
Cython Time: 0.0000133900
Cython was 309.792 times faster

100,000 numbers
Python Time: 0.0438640100
Cython Time: 0.0000172000
Cython was 2550.231 times faster

Overall: Cython Wins

Benchmark 11: Bubble Sorting

Python Code:

    def bubbleSort(arr):
        n = len(arr)
        for i in range(n-1):
            for j in range(0, n-1):
                if arr[j] > arr[j+1]:
                    temp = arr[j]
                    arr[j] = arr[j+1]
                    arr[j+1] = temp
        return arr

Cython Code:

    cpdef list bubbleSort(list arr):
        cdef int n = len(arr)
        cdef int i, j
        for i in range(n-1):
            for j in range(0, n-1):
                if arr[j] > arr[j+1]:
                    temp = arr[j]
                    arr[j] = arr[j+1]
                    arr[j+1] = temp
        return arr

Benchmark#11 Result:

Here we have the classic bubblesort algorithm. As we can see, Cython is able to complete these operations several times faster than regular Python. Another great place to be using Cython, especially because using libraries like numpy will not help much over here (since the majority of the work goes into iterations and comparisons).

100 numbers
Python Time: 0.0053411199
Cython Time: 0.0014683400
Cython was 3.638 times faster

1000 numbers
Python Time: 0.2775652600
Cython Time: 0.0572400799
Cython was 4.849 times faster

Sorting algorithms which use recursion will benefit even more from the use of Cython, as we saw in our very first benchmark.


Conclusion

In conclusion, Cython offers significant speed improvements over CPython in certain scenarios. By leveraging static typing, explicit variable declaration, and efficient use of libraries like NumPy, Cython can eliminate the interpreter overhead and enhance performance.

However, it is important to note that not all benchmarks exhibit substantial speed gains with Cython. The choice between Python and Cython depends on the specific requirements of the application, the complexity of the code, and the need for performance optimization.

Code that is already heavily optimized or uses C/C++ libraries under the hood will not see significant performance boosts.

I would recommend programmers to focus more on actually optimizing their code first, before resorting to techniques like Cython. Cython does not negatively impact performance (overhead is negligible in large operations), and thus can be applied at the very end as the cherry on top.
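One way to act on this advice is to profile before compiling anything. The sketch below uses the standard library's `cProfile` and `pstats` modules on a hypothetical `slow_sum_of_squares` function; the function and its workload are placeholders, not code from the benchmarks.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Placeholder hot spot: a deliberately plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum_of_squares(100_000)
profiler.disable()

# Collect the five most expensive entries by cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Functions that dominate such a report and consist mostly of loops over Python objects are the likeliest candidates for Cython, in line with the benchmark results above.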

This marks the end of the Cython vs CPython Speed Comparison. Any suggestions or contributions for CodersLegacy are more than welcome. Questions regarding the tutorial content can be asked in the comments section below.

The post Cython vs CPython – Comparing the Speed Difference appeared first on CodersLegacy.

Categories: FLOSS Project Planets

Daniel Roy Greenfeld: Converting from bleach to nh3

Fri, 2023-06-09 19:45

Bleach is deprecated, here's how to come close to replicating bleach.clean() using the nh3 version of .clean().

    import nh3

    def clean_string(string: str) -> str:
        # The arguments below being passed to `nh3.clean()` are
        # the default values of the `bleach.clean()` function.
        return nh3.clean(
            string,
            tags={
                "a",
                "abbr",
                "acronym",
                "b",
                "blockquote",
                "code",
                "em",
                "i",
                "li",
                "ol",
                "strong",
                "ul",
            },
            attributes={
                "a": {"href", "title"},
                "abbr": {"title"},
                "acronym": {"title"},
            },
            url_schemes={"http", "https", "mailto"},
            link_rel=None,
        )

The big difference is unlike the safing of HTML done by bleach, nh3 removes the offending tags altogether. Read the comments below to see what this means.


    >>> input_from_user = """<b>
    ... <img src="">
    ... I\'m not trying to XSS you <a href="">Link</a>
    ... </b>"""
    >>>
    >>> # By default, bleach version safes the HTML
    >>> # rather than remove the tags altogether.
    >>> bleach.clean(input_from_user)
    '<b>&lt;img src=""&gt;I\'m not trying to XSS you <a href="">Link</a></b>'
    >>>
    >>> # In contrast, nh3 removes the offending tags entirely
    >>> # while also preserving whitespace.
    >>> clean_string(input_from_user)
    '<b>\n\nI\'m not trying to XSS you <a href="">Link</a>\n</b>'

Advantages of switching to nh3 are:

  1. nh3 is actively maintained, bleach is officially deprecated.
  2. I believe the nh3 technique of stripping tags rather than allowing safing is more secure. The idea of safing is great, but I've always wondered if a creative attacker could find a way to exploit it. So I think it is better to remove the offending tags altogether.
  3. The preservation of whitespace is really useful for preserving content submitted in a textarea. This is especially true for Markdown content.
  4. nh3 is a binding to the rust-ammonia project. They claim a 15x speed increase over bleach's binding to the html5lib project. Even if that is a 3x exaggeration, that's still a 5x speed increase.

William Minchin: Selecting a Code of Conduct for My Software Projects

Fri, 2023-06-09 12:36

Towards the end of 2011 (in the depths of Covid…), I started thinking about adding a code of conduct to my open source software projects. GitHub recommends adding one, somewhat similar to how they recommend including a software license.

In trying to pick a code of conduct (for my projects), it seems helpful to remember the “community”, as such, is often basically me, short (in length) is generally better, and just about anything can be weaponized by bad faith actors.

The Most Basic: Troll Banning

I suppose the most basic form of a code of conduct is just to have a (written) policy to ban any trolls, or “Don’t be a jerk!”. It seems almost ridiculous that you would have to spell this out. But, I suppose as you start dealing with a wider array of people, it’s helpful to outline what behaviour isn’t appreciated (e.g. “Doing X makes you a jerk; don’t do it!”).

The Most Complex: Corporate Codes of Conduct

When searching Google for examples of codes of conduct, the corporate variety was by far the most common that came up; one could argue this is the “true” definition of the term. These tend to be very legalistic (being written by the legal department), very long (often hundreds of pages), and often very complex. But I don’t need all that: I don’t need to deal with travel reimbursements, I don’t need to deal with conflicts of interest, etc. I doubt anyone wanting to contribute a small fix to my software projects will read something that long, and I don’t want to spend the next three months (or three years!) writing something like this either.

The Most Common: The Contributor Code of Conduct

The Contributor Code of Conduct (aka the “Contributor Covenant”) is probably the most commonly used Code of Conduct for software projects and seems to have the largest mindshare; to some extent, it feels like the injunction to “add a code of conduct” is an injunction to bind yourself and your project by the Contributor Code of Conduct, whether you adopt that specific code or not.

It is among the three codes of conduct suggested by GitHub[1], although the three of them seem similar in nature. One of the three actually says the “primary goal of {COMMUNITY_NAME} is to be inclusive to the largest number of contributors, with the most varied and diverse backgrounds possible.”[2] …and I’m not sure that’s a useful goal for any (functional) group. For example, if you take any marketing course, they will suggest having a “target audience” or an ideal customer to position your project for. As a practical matter, if everyone is accepted, what is the common goal or purpose to hold the group together? As you deal with others, it’s fairly obvious that some people are more productive contributors, and some are more excited about the project; both are things that I feel should be encouraged.

For me, the goal is to actually have a working piece of software, for me first of all. I don’t see how these codes of conduct help make that a reality.

The SQLite Option: The Benedictine Code

Presumably other (software/open source) codes of conduct have been developed (though none seem particularly popular), but one of the most interesting (to me) is the story of SQLite and their Code of Conduct adopted from the Benedictine Code. SQLite, for various corporate contracts, was asked to link to their project’s code of conduct, which at the time they lacked. Looking around, they decided to adopt the Benedictine Code, specifically Chapter 4. This Chapter is a list of 73 tools for good works and was originally written for the Benedictine monks ca. 530 AD, and has been a foundation of their Order since. In reading through the list, it felt more like what I myself was looking for. In particular, it seems to focus almost entirely on the actions I want to see in the community.

There has been much criticism leveled at it, mostly seeming to center on its religious nature. I don’t feel the Code itself is particularly religious, although it was written for a group of religious believers who are trying to better live their (shared) faith. Personally, it seems that some people mistake the mention of religion as implying that the text is a religious text, when it is more often that the writers lived in a religious society (such as in this case). Perhaps the right response to these criticisms is to request that other codes of conduct add religious identity and religious expression to their list of prohibited discrimination grounds (to mirror the current listing of “gender, gender identity, and gender expression”); often enough, religious people are asked to keep their faith out of sight as a condition of having it at all. Although the principals at SQLite do seem to be religious, they have also been clear that your (personal) religion will not bar you from participating in the project: in their introduction they explain that you are not required to believe, agree with, or even accept the Code to participate in the project.

Another complaint has been that it doesn’t lay out an enforcement mechanism. Except that it does, asking that those who have failed to live up to the ideals of the Code to be extended grace; for a first pass of many issues, this seems a reasonable response.

And the mention of chastity seems like a brilliant way of heading off the sexual harassment and assault concerns that can be the most impactful for a community to protect against.

For Me

For my projects, I’ll be using a version of the Benedictine Code. It seems short enough that people may actually read it, it promotes the things I want to see in the community (rather than just being a list of horrible things people might do to each other), and its references to grace seem like a decent response to the misunderstanding most commonly encountered. Here’s the text:

Code of Conduct Purpose

The Founder of this project, and all of the current developers at the time when this document was composed, have pledged to govern their interactions with each other and with this project’s larger user community in accordance with the “instruments of good works” from chapter 4 of The Rule of St. Benedict (hereafter: “The Rule”). This code of ethics has proven its mettle in thousands of diverse communities for over 1,500 years, and has served as a baseline for many civil law codes since the time of Charlemagne.

Scope of Application

No one is required to follow The Rule, to know The Rule, or even to think that The Rule is a good idea. The Founder of this project believes that anyone who follows The Rule will live a happier and more productive life, but individuals are free to dispute or ignore that advice if they wish.

The Founder of this project and all current developers have pledged to follow the spirit of The Rule to the best of their ability. They view The Rule as their promise to all project users of how the developers are expected to behave. This is a one-way promise, or covenant. In other words, the developers are saying: “We will treat you this way regardless of how you treat us.”

The Rule
  1. First of all, love the Lord God with your whole heart, your whole soul, and your whole strength.
  2. Then, love your neighbor as yourself.
  3. Do not murder.
  4. Do not commit adultery.
  5. Do not steal.
  6. Do not covet.
  7. Do not bear false witness.
  8. Honor all people.
  9. Do not do to another what you would not have done to yourself.
  10. Deny oneself in order to follow Christ.
  11. Chastise the body.
  12. Do not become attached to pleasures.
  13. Love fasting.
  14. Relieve the poor.
  15. Clothe the naked.
  16. Visit the sick.
  17. Bury the dead.
  18. Be a help in times of trouble.
  19. Console the sorrowing.
  20. Be a stranger to the world’s ways.
  21. Prefer nothing more than the love of Christ.
  22. Do not give way to anger.
  23. Do not nurse a grudge.
  24. Do not entertain deceit in your heart.
  25. Do not give a false peace.
  26. Do not forsake charity.
  27. Do not swear, for fear of perjuring yourself.
  28. Utter only truth from heart and mouth.
  29. Do not return evil for evil.
  30. Do no wrong to anyone, and bear patiently wrongs done to yourself.
  31. Love your enemies.
  32. Do not curse those who curse you, but rather bless them.
  33. Bear persecution for justice’s sake.
  34. Be not proud.
  35. Be not addicted to wine.
  36. Be not a great eater.
  37. Be not drowsy.
  38. Be not lazy.
  39. Be not a grumbler.
  40. Be not a detractor.
  41. Put your hope in God.
  42. Attribute to God, and not to self, whatever good you see in yourself.
  43. Recognize always that evil is your own doing, and to impute it to yourself.
  44. Fear the Day of Judgment.
  45. Be in dread of hell.
  46. Desire eternal life with all the passion of the spirit.
  47. Keep death daily before your eyes.
  48. Keep constant guard over the actions of your life.
  49. Know for certain that God sees you everywhere.
  50. When wrongful thoughts come into your heart, dash them against Christ immediately.
  51. Disclose wrongful thoughts to your spiritual mentor.
  52. Guard your tongue against evil and depraved speech.
  53. Do not love much talking.
  54. Speak no useless words or words that move to laughter.
  55. Do not love much or boisterous laughter.
  56. Listen willingly to holy reading.
  57. Devote yourself frequently to prayer.
  58. Daily in your prayers, with tears and sighs, confess your past sins to God, and amend them for the future.
  59. Fulfill not the desires of the flesh; hate your own will.
  60. Obey in all things the commands of those whom God has placed in authority over you even though they (which God forbid) should act otherwise, mindful of the Lord’s precept, “Do what they say, but not what they do.”
  61. Do not wish to be called holy before one is holy; but first to be holy, that you may be truly so called.
  62. Fulfill God’s commandments daily in your deeds.
  63. Love chastity.
  64. Hate no one.
  65. Be not jealous, nor harbor envy.
  66. Do not love quarreling.
  67. Shun arrogance.
  68. Respect your seniors.
  69. Love your juniors.
  70. Pray for your enemies in the love of Christ.
  71. Make peace with your adversary before the sun sets.
  72. Never despair of God’s mercy.

(The header image is “St. Benedict delivering his Rule to St. Maurus and other monks of his order”, from a manuscript from Monastery of St. Gilles in Nîmes, France, dated 1129. The image is copied from Wikimedia Commons.)

  1. The Contributor Covenant, the Django Code of Conduct and the Citizen Code of Conduct are the three codes of conduct, from GitHub’s Open Source Guide

  2. “Purpose” (the first section) of the Citizen Code of Conduct 


Python Engineering at Microsoft: Python in Visual Studio Code – June 2023 Release

Fri, 2023-06-09 12:04

We’re excited to announce that the June 2023 release of the Python and Jupyter extensions for Visual Studio Code is now available!

This release includes the following announcements:

  • Test Discovery and Execution Rewrite
  • Run Python File in Dedicated Terminal
  • Preview: IntelliSense support for overloaded operators
  • Configurable indexing limits with Pylance

If you’re interested, you can check the full list of improvements in our changelogs for the Python, Jupyter and Pylance extensions.

Test Discovery and Execution Rewrite

This month, we are beginning the roll out of our testing rewrite behind an experimental feature. This rewrite redesigns the architecture behind test discovery and execution for both unittest and pytest in the extension. While it does not provide any additional functionality exposed to the user, it provides a faster and more stable experience, and opens up new functionality opportunities moving forward. The rewrite will be rolled out behind the experimental "pythonTestAdapter" flag, which you can opt into with "python.experiments.optInto" in your settings.json. Eventually, we plan to remove the setting and adopt this new architecture. If you have any comments or suggestions regarding this experiment or rewrite, please share them in the vscode-python repo.
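As a rough illustration, opting in amounts to adding the experiment name to the `python.experiments.optInto` entry in settings.json. The sketch below builds that fragment in Python and prints it as JSON; it assumes the setting takes a list of experiment names, and the `settings` dictionary is a stand-in for whatever your file already contains.

```python
import json

# Stand-in for the existing contents of settings.json (assumed empty here).
settings = {}

# Add the experiment name without duplicating it if it is already listed.
opted_in = settings.setdefault("python.experiments.optInto", [])
if "pythonTestAdapter" not in opted_in:
    opted_in.append("pythonTestAdapter")

fragment = json.dumps(settings, indent=4)
print(fragment)
```

The printed object is what would be merged into settings.json by hand; the sketch does not read or write the real file.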

Run Python File in Dedicated Terminal

UPDATE (13 June 2023) – This feature has been rolled back due to a bug tracked by vscode-python#21393.

The Python extension will now create a new terminal for each file you run using the run button in the top right corner of the editor or the Python: Run Python File in Terminal command. This also means the Python extension will keep using this file’s “dedicated” terminal every time you re-run the file.

Any time you wish to run the same file in a separate terminal, you can select Python: Run Python File in Dedicated Terminal under the run button menu.


Preview: IntelliSense support for overloaded operators with Pylance

Overloaded operators allow you to redefine the behavior of built-in operators for your custom objects or data types. When using the latest pre-release version of the Pylance extension, you are now able to use IntelliSense to explore and utilize overloaded operators with ease and efficiency.

This functionality provides code completion, parameter information, and signature help for overloaded operators, whether you’re working with mathematical vectors, complex numbers, or any other custom classes.
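For readers unfamiliar with the term, here is a minimal sketch of operator overloading on a hypothetical `Vector` class. It illustrates only the language feature; what Pylance displays for such operators is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    """A tiny 2D vector used only to illustrate operator overloading."""
    x: float
    y: float

    def __add__(self, other: "Vector") -> "Vector":
        # Called for `a + b`.
        return Vector(self.x + other.x, self.y + other.y)

    def __mul__(self, scalar: float) -> "Vector":
        # Called for `a * 3`.
        return Vector(self.x * scalar, self.y * scalar)


total = Vector(1, 2) + Vector(3, 4)
scaled = Vector(1, 2) * 3
print(total, scaled)
```

With type annotations like these on `__add__` and `__mul__`, an editor has enough information to offer signature help at the `+` and `*` call sites.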

Configurable indexing limits with Pylance

There’s a new Pylance setting that allows you to configure the file count limit for indexing: "python.analysis.userFileIndexingLimit", which is set to 2000 by default. This setting can be particularly helpful when you’re working with very large projects and are willing to compromise performance for an enhanced IntelliSense experience.

Other Changes and Enhancements

We have also added small enhancements and fixed issues requested by users that should improve your experience working with Python and Jupyter Notebooks in Visual Studio Code. Some notable changes include:

  • New experimental createEnvironment.contentButton setting to disable the Create Environment button in dependency files (vscode-python#21212)
  • Detect installed packages in the selected environment (vscode-python#21231)
  • New python.analysis.inlayHints.callArgumentNames setting to enable inlay hints for call argument names with Pylance


Try out these new improvements by downloading the Python extension and the Jupyter extension from the Marketplace, or install them directly from the extensions view in Visual Studio Code (Ctrl + Shift + X or ⌘ + ⇧ + X). You can learn more about Python support in Visual Studio Code in the documentation. If you run into any problems or have suggestions, please file an issue on the Python VS Code GitHub page.

The post Python in Visual Studio Code – June 2023 Release appeared first on Python.

Categories: FLOSS Project Planets

Kay Hayen: Nuitka Release 1.6

Fri, 2023-06-09 09:41

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.

This release bumps the much awaited 3.11 support to full level. This means Nuitka is now expected to behave identically to CPython3.11 for the most part.

There are plenty of new features in Nuitka, e.g. a new testing approach with reproducible compilation reports, support for including the metadata of a distribution, and more.

In terms of bug fixes, it’s also huge, and esp. macOS got a lot of improvements that solve issues with prominent packages in our dependency detection. And then for PySide we found a corruption issue, that got workarounds.

Bug fixes
  • The new dict in optimization was crashing at compile time on code where the dictionary shaped value checked for a key was actually a conditional expression

    # Was crashing
    "value" in some_dict if condition else other_dict

    Fixed in 1.5.1 already.

  • Standalone: Added support for openvino. This also required to make sure to keep used DLLs and their dependencies in the same folder. Before they were put on the top level. Fixed in 1.5.1 already.

  • Android: Convert RPATH to RUNPATH such that standalone binaries need no LD_LIBRARY_PATH guidance anymore. Fixed in 1.5.1 already.

  • Standalone: Added support for newer skimage. Fixed in 1.5.1 already.

  • Standalone: Fix, new data file type .json needed to be added to the list of extensions used for the Qt plugin bindings. Fixed in 1.5.2 already.

  • Standalone: Fix, the nuitka_types_patch module used during startup was released, which can have bad effects. Fixed in 1.5.2 already.

  • Android: More reliable detection of the Android based Python Flavor. Fixed in 1.5.2 already.

  • Standalone: Added data files for pytorch_lightning and lightning_fabric packages. Added in 1.5.2 already.

  • Windows: Fix, the preservation of PATH didn’t work on systems where this could lead to encoding issues due to reading a MBCS value and writing it as a unicode string. We now both read and write the environment value as unicode. Fixed in 1.5.3 already.

  • Plugins: Fix, the scons report values were not available when --remove-output deleted the report before use. It is now read in case it will be used. Fixed in 1.5.3 already.

  • Python3.11: Added support for ExceptionGroup built-in type. Fixed in 1.5.4 already.

  • Anaconda: Fix, using numpy in a virtualenv and not from conda package was crashing. Fixed in 1.5.4 already.

  • Standalone: Added support for setuptools. Due to the anti-bloat work, we didn’t notice that when setuptools was not sufficiently usable, the compiled result was not usable either. Fixed in 1.5.4 already.

  • Distutils: Added support for pyproject with src folders. This now supports tool.setuptools.packages.find with a where value in pyproject files, where it is typically used like this:

    [tool.setuptools.packages.find]
    where = ["src"]

  • Windows: Fix, the nuitka-run batch file was not working. Fixed in 1.5.4 already.

  • Standalone: Add pymoo implicit dependencies. Fixed in 1.5.5 already.

  • macOS: Avoid deprecated API, this should fix newer Xcode being used. Fixed in 1.5.5 already.

  • Fix, the multiprocessing in spawn mode didn’t handle relative paths that become invalid after process start. Fixed in 1.5.5 already.

  • Fix, spec %CACHE_DIR% was not given the correct folder on non-Windows. Fixed in 1.5.5 already.

  • Fix, special float values like nan and inf didn’t properly generate code for C values. Fixed in 1.5.5 already.

  • Standalone: Add missing DLL for onnxruntime on Linux too. Fixed in 1.5.5 already.

  • UI: Fix, illegal python flags values could enable site mode by mistake and were not caught. Fixed in 1.5.6 already.

  • Windows: Fix, user names with spaces failed with MinGW64 during linking. Fixed in 1.5.6 already.

  • Linux: Fix, was not excluding all libraries from glibc, which could cause crashes on newer systems. Fixed in 1.5.6 already.

  • Windows: Fix, could still pickup SxS libraries distributed by other software when found in PATH. Fixed in 1.5.6 already.

  • Windows: Fix, do not use cached DLL dependencies if one of the files listed there went missing. Fixed in 1.5.6 already.

  • Onefile: Reject path spec that points to a system folder. We do not want to delete those when cleaning up clearly. Added in 1.5.6 already.

  • Plugins: Fix, the dill-compat was broken by code object changes. Fixed in 1.5.6 already.

  • Standalone: Added workaround for networkx decorator issues. Fixed in 1.5.7 already.

  • Standalone: Added workaround for PySide6 problem with disconnecting signals from methods. Fixed in 1.5.7 already.

  • Standalone: Added workaround for PySide2 problem with disconnecting signals.

  • Fix, need to make sure the yaml package is located absolutely or else case insensitive file systems can confuse things. Fixed in 1.5.7 already.

  • Standalone: Fix, extra scan paths were not considered in caching of module imports, breaking the feature in many cases. Fixed in 1.5.7 already.

  • Windows: Fix, avoid system installed appdirs package as it is frequently broken. Fixed in 1.5.7 already.

  • Standalone: The bytecode cache check needs to handle re-checking relative imports found in the cache better. Otherwise some standard library modules were always recompiled due to apparent import changes. Fixed in 1.5.7 already.

  • Nuitka-Python: Fix, do not insist on PYTHONHOME making it to os.environ in order to delete it again. Fixed in 1.5.7 already.

  • Nuitka-Python: Allow builtin modules of all names. This is of course what it does. Fixed in 1.5.7 already.

  • Nuitka-Python: Ignore empty extension module suffix. Was confusing Nuitka to consider every file an extension module potentially. Fixed in 1.5.7 already.

  • Plugins: Properly merge code coming from distinct plugins. The __future__ imports need to be moved to the start. Added in 1.5.7 already.

  • Standalone: Added support for opentele package. Fixed in 1.5.7 already.

  • Standalone: Added support for newer pandas and pyarrow usage. Fixed in 1.5.7 already.

  • Standalone: Added missing implicit dependency for PySide6. Fixed in 1.5.7 already.

  • Fix, the pyi-file parser didn’t handle doc strings, and could crash for comment contents not conforming to import statement code. Fixed in 1.5.8 already.

  • Standalone: Added support for pyqtlet2 data files.

  • Python2: Fix, PermissionError doesn’t exist on that version, which could lead to issues with retries for locked files e.g. but was also observed with symlinks.

  • Plugins: Recognize the error given by upx if a file is already compressed.

  • Fix, so called “fixed” imports were not properly tracking their use, such that they then didn’t show up in reports, and didn’t cause dependencies on the module, which could e.g. impact importlib to not be included even if still being used.

  • Windows: Fix, retries for payload attachment were crashing when maximum number of retries were reached. Using the common code for retries solves that, since that code handles it just fine.

  • Standalone: Added support for the av module.

  • Distutils: Fix, should build from files in build folder rather than source files. This allows tools like versioneer that integrate with setuptools to do their thing, and get the result of that to compilation rather than the original source files.

  • Standalone: Added support for the Equation module.

  • Windows/macOS: Avoid problems with case insensitive file systems. The nuitka.Constants module and nuitka.constants package could collide, so we now avoid that package, there was only what is now nuitka.Serialization in there anyway. Also similar problem with nuitka.utils.Json and json standard library module.

  • Standalone: Added support for transformers package.

  • Standalone: Fix for PyQt5 which needs a directory to exist.

  • macOS: Fix, was crashing with PyQt6 in standalone mode when trying to register plugins to non-default path. We now try to skip the need, which also makes it work.

  • Fix, recursion errors for complex code that do not happen in the ast module, but during conversion of the node tree it gives to our own tree, were not handled, and crashed with RecursionError. This is now also handled, just like the error from ast.

  • Standalone: Added support for sqlfluff.

  • Standalone: Added support for PySide 6.5 on macOS solving DLL dependency issues.

  • Scons: Recognize more ccache outputs properly, their logging changed and provided irrelevant states, and ones not associated so far.

  • Onefile: Fix, could do random exit codes when failing to fork for whatever reason.

  • Standalone: Added support for pysnmp package.

  • Standalone: Added support for torchaudio and tensorflow on macOS. These contain broken DLL dependencies as relative paths, that are apparently ignored by macOS, so we do that too now.

  • Onefile: Use actual rather than guessed standalone binary name for multiprocessing spawns. Without this, a renamed onefile binary, didn’t work.

  • Fix, side effect nodes, that are typically created when an expression raises, were used in optimization contexts, where they do not work.

  • Standalone: Added missing implicit dependency for sentence_transformers package.
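One of the fixes above concerns special float values like nan and inf when generating C values. The following is a minimal sketch of why such values need special handling in a code generator: their Python repr is not a valid C literal. The helper name `float_to_c_literal` is hypothetical and this is not Nuitka's actual code; it assumes the generated C code includes `<math.h>` for the `NAN` and `INFINITY` macros.

```python
import math


def float_to_c_literal(value: float) -> str:
    """Return a C expression for a Python float (hypothetical helper)."""
    if math.isnan(value):
        # repr() would give "nan", which is not a C literal.
        return "NAN"
    if math.isinf(value):
        # repr() would give "inf" or "-inf", which are not C literals.
        return "INFINITY" if value > 0 else "-INFINITY"
    # For finite values, repr() round-trips and is valid C double syntax.
    return repr(value)


for sample in (1.5, float("inf"), float("-inf"), float("nan")):
    print(sample, "->", float_to_c_literal(sample))
```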

New Features
  • Support for Python 3.11 is finally there. This took very long, because there were way more core changes than with previous releases. Nuitka integrates close to that core, and is as such very affected by this. Also a lot of missed opportunities to improve 3.7 or higher, 3.9 or higher, and 3.10 or higher were implemented right away, as they were discovered on the way. Those had core changes not yet taken advantage of and as a result got faster with Nuitka too.

  • Reports: Added option --report-diffable to make the XML report created with --report become usable for comparison across different machine installations, users compiling, etc. so it can be used to compare versions of Nuitka and versions of packages being compiled for changes. Also avoid short names in reports, and resolve them back to long names, so they become more portable too.

  • Reports: Added option to provide custom data from the user. We use it in our testing to record the pipenv state used with things like --report-user-provided=pipenv-lock-hash=64a5e4 with this data ending up inside of reports, where tools like the new testing tool nuitka-watch can use it to decide if upstream packages changed or not. These are free form, they just need to fit XML rules.

  • Plugins: Added include-pyi-file flag to data-files section. If provided, the .pyi file belonging to a specific module is included. Some packages, e.g. skimage depend at runtime on them. For data file options and configuration, these files are excluded, but this is now the way to force their inclusion. Added in 1.5.1 already.

  • Compatibility: Added support for including distribution metadata with new option --include-distribution-metadata.

    This allows generic walks over distributions and their entry points to succeed, as well as version checks with the metadata packages that are not compile time optimized.

  • Distutils: Handle extension modules in build tasks. Also recognize if we built it ourselves, in which case we remove it for rebuild. Added in 1.5.7 already.

  • Linux: Detect DLL like filenames that are Python extension modules, and ignore them when listing DLLs of a package with --list-package-dlls option. So far, this was a manual task to figure out actual DLLs. This will of course improve the Yaml package configuration tooling.

  • Onefile: Allow forcing to use no compression for the onefile payload, useful for debugging, to avoid long compression times and for test coverage of the rare case of not compressing if the bootstrap handles that correctly too.

  • Need to resolve symlinks that were used to call the application binary in some places on macOS at least. We therefore implemented the previously experimental and Windows only feature for all platforms.

  • Standalone: Added support for including symlinks on non-Windows in standalone distribution, if they still point to a path that is inside the distribution. This can save a bunch of disk space used for some packages that e.g. distribute DLL links on Linux.

  • Onefile: Added support for including symlinks from the standalone distribution as such on non-Windows. Previously they were resolved to complete copies.

  • UI: Respect code suffixes in package data patterns. With this e.g. --include-package-data=package_name:*.py is doing what you say, even if of course, that might not be working.

  • UI: Added --edit-module-code option.

    To avoid manually locating code to open it in Visual Code, the old find-module helper was replaced by a main Nuitka option, where it is more accessible. This also goes beyond it, in that it resolves standalone file paths to module names to make debugging easier, and it opens the file right away.

  • Standalone: Added support for handling missing DLLs. Needed for macOS PySide6.5.0 from PyPI, which contains DLL references that are broken. With this feature, we can exclude DLLs that wouldn’t work anyway.

  • macOS: Fix, added missing dependency for platform module.

  • Anti-Bloat: Remove IPython usage in huggingface_hub package versions. Added in 1.5.2 already.

  • Anti-Bloat: Avoid IPython usage in tokenizers module. Added in 1.5.4 already.

  • Added support for module type as a constant value. We want to add all types we have shapes for to allow better type(x) optimization. This is only the start.

  • Onefile: During payload unpacking the memory mapped data was copied to an input buffer. Removing that avoids memory copying and reduces usage.

  • Onefile: Avoid repeated directory creations. Without it, the bootstrap was creating already existing directories up to the root over and over, making many unnecessary file system checks. Added in 1.5.5 already.

  • Anti-Bloat: Remove usage of IPython in trio package. Added in 1.5.6 already.

  • Onefile: Use resource for payload on Win32 rather than overlay. This integrates better with signatures, removing the need to check for original file size. Changed in 1.5.6 already.

  • Onefile: Avoid using zstd input buffer, but using the memory mapped contents directly avoiding to copy uncompressed payload data. Changed in 1.5.6 already.

  • Onefile: Avoid double slashes in expanded onefile temp spec paths, they are just ugly.

  • Anti-Bloat: Remove usage of pytest and IPython for some packages used by newer torch. Added in 1.5.7 already.

  • Anti-Bloat: Avoid triton to use setuptools. Added in 1.5.7 already.

  • Anti-Bloat: Avoid pytest in newer networkx package. Added in 1.5.7 already.

  • Prepare optimization for more built-in types with experimental code, but we need to disable it for now as it requires more completeness in code generation to cover them all. We did some, e.g. module type, but many more will be missing still.

  • Prepare optimization of class selection at compile time, by having a helper function rather than a dedicated node. This work is not complete though, and cannot be activated yet.

  • Windows: Cache short path name resolutions. Esp. for reporting, we now do a lot more of these than before, and this avoids them becoming too time consuming.

  • Faster constant value handling for float value checks by avoiding module lookups per value.

  • Minimize size for hello world distribution such that no unused extension modules are included, by excluding even more modules and using modules from automatic inclusion of standard library.

  • Anti-Bloat: Catch pytest namespaces py and _pytest sooner, to point to the actual uses more directly.

  • Anti-Bloat: Usage of doctest equals usage of “unittest” so cover it too, to point to the actual uses more directly.

  • Ever more spelling fixes in code and tests were identified and fixed.

  • Make sure side effect nodes indicate properly that they are raising, allowing exceptions to fully bubble up. This should lead to more dead code being recognized as such.

  • GitHub: Added marketplace action designed to cross platform build with Nuitka on GitHub directly. Usable with both standard and commercial Nuitka versions, and pronouncing it as officially supported.

    Check it out at the Nuitka-Action repository.

  • Windows: When MSVC doesn’t have WindowsSDK, just don’t use it, and proceed, to e.g. allow fallback to winlibs gcc.

  • User Manual: The code to update benchmark numbers as given was actually wrong. Fixed in 1.5.1 already.

  • UI: Make it clear that partially supported versions are considered experimental, not unsupported. Fixed in 1.5.2 already.

  • Plugins: Do not list deprecated plugins with plugin-list, they do not have any effect, but listing them, makes people use them still. Fixed in 1.5.4 already.

  • Plugins: Make sure all plugins have descriptions. Some didn’t have any yet, and sometimes the wording was improved. Fixed in 1.5.4 already.

  • UI: Accept y as a shortcut for yes in prompts. Added in 1.5.5 already.

  • Reports: Make sure the DLL dependencies for Linux are in a stable order. Added in 1.5.6 already.

  • Plugins: Check for latest fixes in PySide6. Added in 1.5.6 already.

  • Windows XP: For Python3.4 make using Python2 scons work again, we cannot have 3.5 or higher there. Added in 1.5.6 already.

  • Quality: Updated to latest PyLint. With Python 3.11 the older one, was not really working, and it was about time. Due to its many changes, we included it in the hotfix, so those can still be done. Changed in 1.5.7 already.

  • Release: Avoid broken requires.txt in source distribution. This apparently breaks poetry. Changed in 1.5.7 already.

  • GitHub: Enhanced issue template for more clarity, esp. to avoid unnecessary options, e.g. using --onefile for issues that show up with --standalone already, to report factory branch issues rather on Discord, and give a quick tip for a likely reproducer if a package fails to import.

  • User Manual: Added instructions on how to add a DLL or executable to a standalone distribution.

  • User Manual: Example paths in the table for path specs, meant for Windows were not properly escaping the backslashes and therefore rendered incorrectly.

  • Visual Code: Python3.11 is now the default configuration for C code editing.

  • Developer Manual: Updated descriptions for adding a test suite. While adding the Python 3.11 test suite, these instructions were further improved.

  • Debugging: Make it easier to fully deactivate free lists. Now only need to set max size to 0 and the free list will not be used.

  • Debugging: Added more assertions, added corrections to feature disables, check args after function calls for validity, check more types to be as expected.

  • Plugins: Enhanced plugin error messages generally, with --debug exceptions become warning messages with the original exception being raised instead, making debugging during development much easier.

  • UI: Make it clear what not using ccache actually means. Not everybody is familiar with the design of Nuitka there or what the tool can actually do.

  • UI: Do not warn about not found distributions but merely inform of them.

    Since Nuitka is fully compatible with these, no need to consider those a warning, for some packages they also are given really a lot.

  • UI: Catch user error of wrong cases plugin names

    This now points out the proper name rather than denying the existence outright. We do not want to accept wrong case names silently.

  • Use proper API for setting PyConfig values during interpreter initialization. There is otherwise always the risk of crashes, should these values change during runtime. Fixed in 1.5.2 already.

  • For our reformulations, have a helper function that builds release statements for multiple variables at once. This removed a bunch of repetitive code from re-formulations.

  • Move the pyi-file parser code out of the module nodes and to source handling, where it is more closely related.

  • Adding a nuitka-watch tool, which is still experimental and for use with the Nuitka-Watch repository.

  • Refined macOS standalone exceptions further to cover more normal usages of files on that OS and for frameworks that applications typically use from the system.

  • Detect and consider onefile mode if given in project options as well.

  • Was not really applying import check in programs tests. Added in 1.5.6 already.

  • Added coverage of testing the signing of Windows binaries with the commercial plugin.

  • Added coverage of version information to hello world onefile test, so we can use it for Virus tools checks.

  • Added tests to cover PyQt6 and PySide6 plugin availability, we so far only had that for PyQt5, which is of course not relevant, and totally different code anyway.

  • Cleanup distutils tests case to use common test case scanning. We now decide version skips based on names, and had to get away from number suffixes, so they are now in the middle.
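The UI change above about wrongly cased plugin names (point out the proper name, but do not accept the wrong case silently) can be sketched roughly as follows. This is an illustrative assumption about the behavior, not Nuitka's implementation; `resolve_plugin_name` and the `KNOWN_PLUGINS` tuple are hypothetical names made up for the example.

```python
# Illustrative plugin names only; the real list comes from the tool itself.
KNOWN_PLUGINS = ("anti-bloat", "pyside6", "tk-inter")


def resolve_plugin_name(requested, known=KNOWN_PLUGINS):
    """Return the plugin name if it matches exactly, else raise a helpful error."""
    if requested in known:
        return requested

    # Look for a name that differs from the request only by letter case.
    by_folded = {name.casefold(): name for name in known}
    proper = by_folded.get(requested.casefold())

    if proper is not None:
        # Wrong case is still an error, but the message names the proper spelling.
        raise ValueError(f"Unknown plugin {requested!r}, did you mean {proper!r}?")

    raise ValueError(f"Unknown plugin {requested!r}.")


try:
    resolve_plugin_name("PySide6")
except ValueError as error:
    print(error)
```

Raising instead of silently normalizing keeps command lines portable, while the message still tells the user which spelling to use.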


The class bodies optimization has made some progress in this release, going to a re-formulation of the metaclass selection, so as to allow its future optimization. We are not yet at “compiled objects”, but this is a promising road. We need to make some optimization improvements for inlining constant value calls, then this can become really important, but by itself these changes do not yield a lot of improvement.

For macOS again a bunch of time was spent to improve and complete the detection of DLL dependencies. More corner cases are covered now and more packages just work fine as a result.

The most important thing is to become Python3.11 compatible, even if attribute lookups, and other things, are not yet optimized. We will get to that in future releases. For now, compatibility is the first step to take.

For GitHub users, the Nuitka-Action will be interesting. But it’s still in development. It seems we will keep adding missing options of Nuitka for a while, but for most people it should be usable already.

The new nuitka-watch ability, should allow us to detect breaking PyPI releases, that need a new tweak in Nuitka sooner. But it will probably grow in the coming releases to full value only. For now the tool itself is not yet finished.

From here, a few open ends in the CPython 3.11 test suite will have to be addressed, and maybe some of the performance tricks that it now will enable, e.g. with repeated attribute lookups.

Categories: FLOSS Project Planets

Mike Driscoll: The Python Show Podcast Now on YouTube

Fri, 2023-06-09 09:38

The Python Show Podcast now has a YouTube channel too. You can find the first episode of The Python Show there now:

If you prefer to listen to The Python Show, you can stream it on the following websites:

The post The Python Show Podcast Now on YouTube appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Real Python: The Real Python Podcast – Episode #159: Volunteering, Organizing, and Finding a Python Community

Fri, 2023-06-09 08:00

Have you thought about getting more involved in the Python community? Are you interested in volunteering for an event or becoming an organizer? This week on the show, we speak with organizers from this year's PyCascades conference about making connections, learning new skills, and rationing your time.


Categories: FLOSS Project Planets