Planet Python

Planet Python - http://planetpython.org/

ListenData: 4 Ways to Correct Grammar with Python

Tue, 2024-01-30 18:09

This tutorial explains various methods for checking and correcting grammatical errors using Python. Automatic grammar correction helps students, professionals and content creators to make sure their writing follows proper grammar rules.

To read this article in full, please click here. This post appeared first on ListenData.

PyCoder’s Weekly: Issue #614 (Jan. 30, 2024)

Tue, 2024-01-30 14:30

#614 – JANUARY 30, 2024
View in Browser »

Create a Tic-Tac-Toe Python Game Engine With an AI Player

In this video course, you’ll create a universal game engine in Python for tic-tac-toe with two computer players, one of which will be an AI player using the powerful minimax algorithm. You’ll give your game library a text-based graphical interface and explore two front ends.
REAL PYTHON course

A Priority-Expiry LRU Cache Without Heaps or Trees

This article covers how to implement an LRU cache using only the Python standard library. It does so the hard way, staying away from heaps and binary search trees.
ADRIAN

Snyk’s Ethical Hacking 101 Workshop ⚡ | February 8, 2024 | 11:00am ET

Join Snyk’s Ethical Hacking 101 Workshop and learn how ethical hacking can help you identify security weaknesses in your systems before attackers do 🤖 Get live support in the hands-on lab ✅ as you find and fix vulnerabilities 🛠️ and ✅ learn the process of responsible disclosure. Register today →
SNYK.IO sponsor

Debugging Python

Tips for debugging Python, based on a talk done at PyCon Sweden. Learn how to be better at debugging your Python code!
JUHA-MATTI SANTALA

DSF Calls for Applicants for a Django Fellow

DJANGO SOFTWARE FOUNDATION

Wagtail 5.2.3 Released

WAGTAIL

Articles & Tutorials

Ten Python datetime Pitfalls

It’s no secret that the Python datetime library has its quirks. Not only are there probably more than you think, but third-party libraries don’t address most of them! Arie created a new library to explore what a better datetime library could look like.
ARIE BOVENBERG • Shared by Arie Bovenberg
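To give a flavor of the quirks involved, here is one classic pitfall (my own illustrative sketch, not taken from the article): naive and aware datetimes refuse to compare.

    from datetime import datetime, timezone

    naive = datetime(2024, 1, 30, 12, 0)
    aware = datetime(2024, 1, 30, 12, 0, tzinfo=timezone.utc)

    # Comparing naive and aware datetimes raises, rather than guessing a zone:
    try:
        print(naive < aware)
    except TypeError as e:
        print(e)  # can't compare offset-naive and offset-aware datetimes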

Exploring the Role of Static Methods in Python

Python is a versatile language, supporting multiple programming paradigms, including procedural, object-oriented, and functional programming. This article discusses static methods, but from a functional programming perspective.
BOB BELDERBOS

Performance Analysis of Python’s dict() and {}

This article delves into the details behind the choice of calling dict() or using {} directly in your code. It covers the underlying structures in the interpreter as well as performance.
KAMIL RUSIN
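As a taste of the difference (my own sketch, not from the article): the {} literal compiles to a single bytecode instruction, while dict() is a global name lookup plus a call, so the literal is typically faster.

    import dis
    import timeit

    dis.dis("{}")      # one BUILD_MAP instruction
    dis.dis("dict()")  # a name load followed by a call

    # Exact numbers vary by machine and Python version;
    # the literal usually wins.
    print(timeit.timeit("{}"))
    print(timeit.timeit("dict()"))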

My Django Roadmap Ideas

Thibaud Colas is the newest Django Software Foundation member and this post describes his ideas for the Django roadmap session for 2024. Learn what might be coming in future Django core.
PAOLO MELCHIORRE

Python’s Magic Methods: Leverage Their Power in Your Classes

In this tutorial, you’ll learn what magic methods are in Python, how they work, and how to use them in your custom classes to support powerful features in your object-oriented code.
REAL PYTHON

Filter a list in Python

This tutorial explains the different methods for filtering a list in Python. It shows you how to use the filter() function, list comprehensions, and good old loops.
SOUMYA AGARWAL
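For a quick taste of those three approaches, here is a minimal sketch of my own:

    numbers = [1, -2, 3, -4, 5]

    # filter() with a predicate function
    positives = list(filter(lambda n: n > 0, numbers))

    # the equivalent list comprehension
    positives = [n for n in numbers if n > 0]

    # a good old loop
    positives = []
    for n in numbers:
        if n > 0:
            positives.append(n)

    print(positives)  # [1, 3, 5]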

Projects & Code

datamapplot: Creating Beautiful Plots of Data Maps

GITHUB.COM/TUTTEINSTITUTE

marimo: A Reactive Notebook for Python

GITHUB.COM/MARIMO-TEAM

niquests: Requests but Multiplexed

GITHUB.COM/JAWAH

unidep: Unified Pip and Conda Dependency Management

GITHUB.COM/BASNIJHOLT

falco: CLI and Guides for the Modern Django Developer

GITHUB.COM/TOBI-DE

Events

Weekly Real Python Office Hours Q&A (Virtual)

January 31, 2024
REALPYTHON.COM

Canberra Python Meetup

February 1, 2024
MEETUP.COM

Sydney Python User Group (SyPy)

February 1, 2024
SYPY.ORG

Python Devroom @ FOSDEM 2024

February 4 to February 5, 2024
FOSDEM.ORG

Melbourne Python Users Group, Australia

February 5, 2024
J.MP

Happy Pythoning!
This was PyCoder’s Weekly Issue #614.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]


Real Python: Building Enumerations With Python’s enum

Tue, 2024-01-30 09:00

Some programming languages, such as Java and C++, have built-in support for a data type called enumerations, commonly referred to as enums. Enums enable you to create sets of logically related constants that you can access through the enumeration itself. Unlike these languages, Python doesn’t have a dedicated syntax for enums. However, the Python standard library provides an enum module that offers support for enumerations through the Enum class.

If you’re familiar with enums from other languages and wish to use them in Python, or if you simply want to learn how to work with enumerations, then this video course is designed for you.

In this video course, you’ll discover how to:

  • Create enumerations of constants using Python’s Enum class
  • Interact with enumerations and their members in Python
  • Customize enumeration classes by adding new functionalities
  • Apply practical examples to gain a deeper understanding of the benefits of using enumerations

Additionally, you’ll explore other specific enumeration types available in the enum module, such as IntEnum, IntFlag, and Flag. These specialized enums will expand your repertoire.
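As a minimal taste of the basics covered (a sketch of mine using only the standard library):

    from enum import Enum

    class Color(Enum):
        RED = 1
        GREEN = 2
        BLUE = 3

    print(Color.RED)                # Color.RED
    print(Color.RED.value)          # 1
    print([c.name for c in Color])  # ['RED', 'GREEN', 'BLUE']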

To get the most out of this video course, you should be familiar with object-oriented programming and inheritance in Python.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]


Awesome Python Applications: Duplicity

Tue, 2024-01-30 05:42

Duplicity: Encrypted bandwidth-efficient backup tool, using the rsync algorithm.



death and gravity: Limiting concurrency in Python asyncio: the story of async imap_unordered()

Tue, 2024-01-30 03:15

So, you're doing some async stuff, repeatedly, many times.

Like, hundreds of thousands of times.

Maybe you're scraping some data.

Maybe it's more complicated – you're calling an API, and then passing the result to another one, and then saving the result of that.

Either way, it's a good idea to not do it all at once. For one, it's not polite to the services you're calling. For another, it'll load everything in memory, all at once.

In sync code, you might use a thread pool and imap_unordered():

pool = multiprocessing.dummy.Pool(2)

for result in pool.imap_unordered(do_stuff, things_to_do):
    print(result)

Here, concurrency is limited by the fixed number of threads.

But what about async code? In this article, we'll look at a few ways of limiting concurrency in asyncio, and find out which one is best.

Tip

No, it's not Semaphore, despite what Stack Overflow may tell you.

Tip

If you're in a hurry – it's wait().

Getting started #

In order to try things out more easily, we'll start with a test harness of sorts.

Imports:

import asyncio
import sys
import time

Our async map_unordered() behaves pretty much like imap_unordered() – it takes a coroutine function and an iterable of arguments, and runs the resulting awaitables limit at a time:

def map_unordered(func, iterable, *, limit):
    aws = map(func, iterable)
    return limit_concurrency(aws, limit)

The actual running is done by limit_concurrency(). For now, we run them one by one (we'll get back to this later on):

async def limit_concurrency(aws, limit):
    aws = iter(aws)
    while True:
        try:
            aw = next(aws)
        except StopIteration:
            break
        yield await aw

To simulate work being done, we just sleep():

async def do_stuff(i):
    1 / i  # raises ZeroDivisionError for i == 0
    await asyncio.sleep(i)
    return i

Putting it all together, we get a map_unordered.py LIMIT TIME... script that does stuff in parallel, printing timings as we get each result:

async def async_main(args, limit):
    start = time.monotonic()
    async for result in map_unordered(do_stuff, args, limit=limit):
        print(f"{(time.monotonic() - start):.1f}: {result}")
    print(f"{(time.monotonic() - start):.1f}: done")

def main():
    limit = int(sys.argv[1])
    args = [float(n) for n in sys.argv[2:]]
    timeout = sum(args) + 0.1
    asyncio.run(asyncio.wait_for(async_main(args, limit), timeout))

if __name__ == '__main__':
    main()

... like so:

$ python map_unordered.py 2
0.0: done
$ python map_unordered.py 2 .1 .2
0.1: 0.1
0.3: 0.2
0.3: done

Tip

If you need a refresher on lower level asyncio stuff related to waiting, check out Hynek Schlawack's excellent Waiting in asyncio.

asyncio.gather() #

In the Running Tasks Concurrently section of the asyncio docs, we find asyncio.gather(), which runs awaitables concurrently and returns their results.

We can use it to run limit-sized batches:

async def limit_concurrency(aws, limit):
    aws = iter(aws)
    while True:
        batch = list(itertools.islice(aws, limit))
        if not batch:
            break
        for result in await asyncio.gather(*batch):
            yield result

This seems to work:

$ python map_unordered.py 2 .1 .2
0.2: 0.1
0.2: 0.2
0.2: done

... except:

$ python map_unordered.py 2 .1 .2 .2 .1
0.2: 0.1
0.2: 0.2
0.4: 0.2
0.4: 0.1
0.4: done

... those should fit in 0.3 seconds:

| sleep(.1) | sleep(.2) |
| sleep(.2) | sleep(.1) |

... but we're waiting for the entire batch to finish, even if some tasks finish earlier:

| sleep(.1) |...........| sleep(.2) |
| sleep(.2) | sleep(.1) |...........|

asyncio.Semaphore #

Screw the docs, too much to read; after some googling, the first few Stack Overflow answers all point to asyncio.Semaphore.

Like its threading counterpart, we can use it to limit how many times the body of a with block is entered in parallel:

async def limit_concurrency(aws, limit):
    semaphore = asyncio.Semaphore(limit)

    async def wrapper(aw):
        async with semaphore:
            return await aw

    for result in await asyncio.gather(*map(wrapper, aws)):
        yield result

This works:

$ python map_unordered.py 2 .1 .2 .2 .1
0.3: 0.1
0.3: 0.2
0.3: 0.2
0.3: 0.1
0.3: done

... except, because gather() takes a sequence, we end up consuming the entire aws iterable before gather() is even called. Let's highlight this:

def on_iter_end(it, callback):
    for x in it:
        yield x
    callback()

    timeout = sum(args) + 0.1
    args = on_iter_end(args, lambda: print("iter end"))
    asyncio.run(asyncio.wait_for(async_main(args, limit), timeout))

As expected:

$ python map_unordered.py 2 .1 .2 .2 .1
iter end
0.3: 0.1
0.3: 0.2
0.3: 0.2
0.3: 0.1
0.3: done

For small iterables, this is fine, but for bigger ones, creating all the tasks upfront without running them might cause memory issues. Also, if the iterable is lazy (e.g. it comes from a paginated API), we only start work after it has been consumed into memory entirely, instead of processing it in a streaming fashion.

asyncio.as_completed() #

At a glance, asyncio.as_completed() might do what we need – it takes an iterable of awaitables, runs them concurrently, and returns an iterator of coroutines that "can be awaited to get the earliest next result from the iterable of the remaining awaitables".

Sadly, it still consumes the iterable right away:

def as_completed(fs, *, timeout=None):
    ...  # set-up
    todo = {ensure_future(f, loop=loop) for f in set(fs)}
    ...  # actual logic

But there's another, subtler issue.

as_completed() has no limits of its own – it's up to us to limit how fast we feed it awaitables. Presumably, we could wrap the input iterable into a generator that yields awaitables only if enough results came out the other end, and waits otherwise.

However, due to historical reasons, as_completed() takes a plain-old-sync-iterator – we cannot await anything in its (sync) __next__(), and sync waiting of any kind would block (and possibly deadlock) the entire event loop.

So, no as_completed() for you.

asyncio.Queue #

Speaking of threading counterparts, how would you implement imap_unordered() if there was no Pool? Queues, of course!

And asyncio has its own Queue, which you use in pretty much the same way: start limit worker tasks that loop forever, each pulling awaitables, awaiting them, and putting the results into a queue.

async def limit_concurrency(aws, limit):
    aws = iter(aws)
    queue = asyncio.Queue()
    ndone = 0

    async def worker():
        while True:
            try:
                aw = next(aws)
            except StopIteration:
                await queue.put((False, None))
                break
            try:
                await queue.put((True, await aw))
            except Exception as e:
                await queue.put((False, e))
                break

    worker_tasks = [asyncio.create_task(worker()) for _ in range(limit)]

    while ndone < limit or not queue.empty():
        ok, rv = await queue.get()
        if ok:
            yield rv
        elif rv:
            raise rv
        else:
            ndone += 1

The iterable is exhausted before the last "batch" starts:

$ python map_unordered.py 2 .1 .2 .3 .3 .2 .1
0.1: 0.1
0.2: 0.2
0.4: 0.3
0.5: 0.3
iter end
0.6: 0.1
0.6: 0.2
0.6: done

I was going to work up to this in a few steps, but I'll just point out three common bugs this type of code might have (that apply to threads too).

First, we could increment ndone from the worker, but this makes await queue.get() hang forever for empty iterables, since workers never get to run by the time we get to it; because there's no other await, it's not even a race condition.

async def limit_concurrency(aws, limit):
    aws = iter(aws)
    queue = asyncio.Queue()
    ndone = 0

    async def worker():
        nonlocal ndone
        while True:
            try:
                aw = next(aws)
            except StopIteration:
                ndone += 1
                break
            await queue.put(await aw)

    worker_tasks = [asyncio.create_task(worker()) for _ in range(limit)]

    while ndone < limit or not queue.empty():
        yield await queue.get()

$ python map_unordered.py 2
iter end
Traceback (most recent call last):
  ...
asyncio.exceptions.TimeoutError

The solution is to signal the worker is done in-band, by putting a sentinel on the queue. I guess a good rule of thumb is that you want a put() for each get() without a timeout.[1]

Second, you have to catch all exceptions; otherwise, the worker gets killed, and get() waits forever for a sentinel that will never come.

async def limit_concurrency(aws, limit):
    aws = iter(aws)
    queue = asyncio.Queue()
    ndone = 0
    done = object()

    async def worker():
        while True:
            try:
                aw = next(aws)
            except StopIteration:
                await queue.put(done)
                break
            await queue.put(await aw)

    worker_tasks = [asyncio.create_task(worker()) for _ in range(limit)]

    while ndone < limit or not queue.empty():
        rv = await queue.get()
        if rv is done:
            ndone += 1
            continue
        yield rv

$ python map_unordered.py 2 .1 .2 0 .2 .1
0.1: 0.1
0.2: 0.2
0.4: 0.2
iter end
0.5: 0.1
Traceback (most recent call last):
  ...
asyncio.exceptions.TimeoutError
Task exception was never retrieved
future: <Task finished name='Task-3' coro=<limit_concurrency.<locals>.worker() done, defined at map_unordered.py:20> exception=ZeroDivisionError('float division by zero')>
Traceback (most recent call last):
  ...
ZeroDivisionError: float division by zero

Finally, our input iterator is synchronous (for now), so no other task can run during next(aws). But if it were async, any number of tasks could await anext(aws) in parallel, leading to concurrency issues. The fix is the same as with threads: either protect that call with a Lock, or feed awaitables to workers through an input queue.
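For illustration, here's a minimal sketch of the Lock option; the DONE sentinel and make_next_input() helper are hypothetical names of mine, not from the article:

    import asyncio

    DONE = object()  # sentinel: input exhausted

    def make_next_input(aws):
        # Serialize anext() calls so only one worker advances
        # the shared async iterator at a time.
        lock = asyncio.Lock()

        async def next_input():
            async with lock:
                try:
                    return await anext(aws)
                except StopAsyncIteration:
                    return DONE

        return next_input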

Anyway, no need to worry about any of that – a better solution awaits.

Aside: backpressure #

At this point, we're technically done – the queue solution does everything Pool.imap_unordered() does.

So much so, that, like imap_unordered(), it lacks backpressure: when code consuming results from map_unordered() cannot keep up with the tasks producing them, the results accumulate in the internal queue, with potentially infinite memory usage.

>>> pool = multiprocessing.dummy.Pool(1)
>>> for _ in pool.imap_unordered(print, range(4)):
...     time.sleep(.1)
...     print('got result')
...
0
1
2
3
got result
got result
got result
got result

>>> async def async_print(arg):
...     print(arg, '(async)')
...
>>> async for _ in map_unordered(async_print, range(4), limit=1):
...     await asyncio.sleep(.1)
...     print('got result')
...
0 (async)
1 (async)
2 (async)
3 (async)
got result
got result
got result
got result

To fix this, we make the queue bounded, so that workers block while the queue is full.

    queue = asyncio.Queue(limit)

>>> async for _ in map_unordered(async_print, range(5), limit=1):
...     await asyncio.sleep(.1)
...     print('got result')
...
0 (async)
1 (async)
2 (async)
got result
3 (async)
got result
4 (async)
got result
got result
got result

Alas, we can't do the same thing for Pool.imap_unordered() because we don't have access to its queue, but that's a story for another time.

asyncio.wait() #

Pretending we're using threads works, but it's not all that idiomatic.

If only there was some sort of low level, select()-like primitive taking a set of tasks and blocking until at least one of them finishes. And of course there is – we've been purposefully avoiding it this entire time – it's asyncio.wait(), and it does exactly that.

By default, it waits until all tasks are completed, which isn't much better than gather().

But, with return_when=FIRST_COMPLETED, it waits until at least one task is completed. We can use this to keep a limit-sized set of running tasks updated with new tasks as soon as the old ones finish:

async def limit_concurrency(aws, limit):
    aws = iter(aws)
    aws_ended = False
    pending = set()

    while pending or not aws_ended:
        while len(pending) < limit and not aws_ended:
            try:
                aw = next(aws)
            except StopIteration:
                aws_ended = True
            else:
                pending.add(asyncio.ensure_future(aw))

        if not pending:
            return

        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        while done:
            yield done.pop()

We change limit_concurrency() to yield awaitables instead of results, so it's more symmetric – awaitables in, awaitables out. map_unordered() then becomes an async generator function, instead of a sync function returning an async generator. This is functionally the same, but does make it a bit more self-documenting.

async def map_unordered(func, iterable, *, limit):
    aws = map(func, iterable)
    async for task in limit_concurrency(aws, limit):
        yield await task

This implementation has all the properties that the Queue one has:

$ python map_unordered.py 2 .1 .2 .2 .1
0.1: 0.1
0.2: 0.2
0.3: 0.1
0.3: 0.2
iter end
0.3: done

... and backpressure too:

>>> async for _ in map_unordered(async_print, range(4), limit=1):
...     await asyncio.sleep(.1)
...     print('got result')
...
0 (async)
got result
1 (async)
got result
2 (async)
got result
3 (async)
got result

Liking this so far? Here's another article you might like:

Running async code from sync in Python asyncio

Async iterables #

OK, but what if we pass map_unordered() an asynchronous iterable? We are talking about async stuff, after all.

This opens up a whole looking-glass world of async iteration: instead of iter() you have aiter(), instead of next() you have anext(), some of them you await, some you don't... Thankfully, we can support both without making things much worse.

And we don't need to be particularly smart about it either; we can just feed the current code an async iterable from main(), and punch our way through the exceptions:

async def as_async_iter(it):
    for x in it:
        yield x

    args = on_iter_end(args, lambda: print("iter end"))
    args = as_async_iter(args)
    asyncio.run(asyncio.wait_for(async_main(args, limit), timeout))

$ python map_unordered.py 2 .1 .2 .2 .1
Traceback (most recent call last):
  ...
  File "map_unordered.py", line 9, in map_unordered
    aws = map(func, iterable)
TypeError: 'async_generator' object is not iterable

map() doesn't work with async iterables, so we use a generator expression instead.

async def map_unordered(func, iterable, *, limit):
    try:
        aws = map(func, iterable)
    except TypeError:
        aws = (func(x) async for x in iterable)

    async for task in limit_concurrency(aws, limit):
        yield await task

In true easier to ask for forgiveness than permission style, we handle the exception from map() instead of, say, checking if aws is an instance of collections.​abc.​Iterable.
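For contrast, a look-before-you-leap version might look like this (a sketch of mine, not the article's code, assuming the limit_concurrency() defined above):

    import collections.abc

    async def map_unordered_lbyl(func, iterable, *, limit):
        # Check the iterable's type up front instead of catching
        # the TypeError raised by map().
        if isinstance(iterable, collections.abc.AsyncIterable):
            aws = (func(x) async for x in iterable)
        else:
            aws = map(func, iterable)
        async for task in limit_concurrency(aws, limit):
            yield await task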

We could wrap aws to always be an async iterable, but limit_concurrency() is useful on its own, so it's better to support both.

$ python map_unordered.py 2 .1 .2 .2 .1
Traceback (most recent call last):
  ...
  File "map_unordered.py", line 19, in limit_concurrency
    aws = iter(aws)
TypeError: 'async_generator' object is not iterable

For async iterables, we need to use aiter():

async def limit_concurrency(aws, limit):
    try:
        aws = aiter(aws)
        is_async = True
    except TypeError:
        aws = iter(aws)
        is_async = False

$ python map_unordered.py 2 .1 .2 .2 .1
Traceback (most recent call last):
  ...
  File "map_unordered.py", line 32, in limit_concurrency
    aw = next(aws)
TypeError: 'async_generator' object is not an iterator

... and anext():

        while len(pending) < limit and not aws_ended:
            try:
                aw = await anext(aws) if is_async else next(aws)
            except StopAsyncIteration if is_async else StopIteration:
                aws_ended = True
            else:
                pending.add(asyncio.ensure_future(aw))

... which, unlike aiter(), has to be awaited.

Here's limit_concurrency() in all its glory:

async def limit_concurrency(aws, limit):
    try:
        aws = aiter(aws)
        is_async = True
    except TypeError:
        aws = iter(aws)
        is_async = False

    aws_ended = False
    pending = set()

    while pending or not aws_ended:
        while len(pending) < limit and not aws_ended:
            try:
                aw = await anext(aws) if is_async else next(aws)
            except StopAsyncIteration if is_async else StopIteration:
                aws_ended = True
            else:
                pending.add(asyncio.ensure_future(aw))

        if not pending:
            return

        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        while done:
            yield done.pop()

Not as clean as before, but it gets the job done:

$ python map_unordered.py 2 .1 .2 .2 .1
0.1: 0.1
0.2: 0.2
0.3: 0.1
0.3: 0.2
iter end
0.3: done

Anyway, that's it for now. Here's the final version of the code.

Learned something new today? Share this with others, it really helps!

Want to know when new articles come out? Subscribe here to get new stuff straight to your inbox!

If you've made it this far, you might like:

Why you should still read the docs

Bonus: exceptions #

OK, so what about exceptions?

A lot of times, you still want to do the rest of the things, even if one fails. Also, you probably want to know which one failed, but the map_unordered() results are not in order, so how could you tell?

The most flexible solution is to let the user handle it just like they would with Pool.imap_unordered() – by decorating the original function. Here's one way of doing it:

def return_args_and_exceptions(func):
    return functools.partial(_return_args_and_exceptions, func)

async def _return_args_and_exceptions(func, *args):
    try:
        return *args, await func(*args)
    except Exception as e:
        return *args, e

    wrapped = return_args_and_exceptions(do_stuff)
    async for arg, result in map_unordered(wrapped, args, limit=limit):
        print(f"{(time.monotonic() - start):.1f}: {arg} -> {result}")

$ python map_unordered.py 2 .1 .2 0 .2 .1
0.1: 0.1 -> 0.1
0.1: 0.0 -> float division by zero
0.2: 0.2 -> 0.2
0.3: 0.2 -> 0.2
0.3: 0.1 -> 0.1
iter end
0.3: done

Bonus: better decorators? #

Finally, here's a cool thing I learned from the asyncio docs.

When writing decorators, you can use partial() to bind the decorated function to an existing wrapper, instead of always returning a new one. The result is a more descriptive representation:

>>> return_args_and_exceptions(do_stuff)
functools.partial(<function _return_args_and_exceptions at 0x10647fd80>, <function do_stuff at 0x10647d8a0>)

Compare with the traditional version:

def return_args_and_exceptions(func):
    async def wrapper(*args):
        ...
    return wrapper

>>> return_args_and_exceptions(do_stuff)
<function return_args_and_exceptions.<locals>.wrapper at 0x103993560>
  1. Does this have a fancy, academic name? Do let me know! [return]


Python Bytes: #369 The Readability Episode

Tue, 2024-01-30 03:00
Topics covered in this episode:

  • Granian (https://github.com/emmett-framework/granian)
  • pytest 8 is here (https://pythontest.com/pytest/pytest-8-is-here/)
  • Assorted Docker Goodies
  • New GitHub Copilot Research Finds 'Downward Pressure on Code Quality' (https://visualstudiomagazine.com/articles/2024/01/25/copilot-research.aspx)
  • Extras
  • Joke

Watch on YouTube: https://www.youtube.com/watch?v=xSdv-Txpkg0

About the show

Sponsored by us! Support our work through:

  • Our courses at Talk Python Training: https://training.talkpython.fm/
  • The Complete pytest Course: https://courses.pythontest.com/p/the-complete-pytest-course
  • Patreon Supporters: https://www.patreon.com/pythonbytes

Connect with the hosts

  • Michael: @mkennedy@fosstodon.org
  • Brian: @brianokken@fosstodon.org
  • Show: @pythonbytes@fosstodon.org

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.

Michael #1: Granian (https://github.com/emmett-framework/granian)

  • via Andy Shapiro and Bill Crook
  • A Rust HTTP server for Python applications.
  • Granian design goals are:
    • Have a single, correct HTTP implementation, supporting versions 1, 2 (and eventually 3)
    • Provide a single package for several platforms
    • Avoid the usual Gunicorn + uvicorn + http-tools dependency composition on unix systems
    • Provide stable performance (https://github.com/emmett-framework/granian/blob/master/benchmarks/README.md) when compared to existing alternatives
  • Could use better logging, but making my own (https://github.com/emmett-framework/granian/issues/152) taught me maybe I prefer that!
  • Originates from the Emmett framework (https://emmett.sh).

Brian #2: pytest 8 is here (https://pythontest.com/pytest/pytest-8-is-here/)

  • Improved diffs:
    • Very verbose -vv is a colored diff, instead of a big chunk of red.
    • Python code in error reports is now syntax-highlighted as Python.
    • The sections in the error reports are now better separated.
    • Diffs for standard library container types are improved.
    • Added more comprehensive set assertion rewrites for comparisons other than equality ==, with the following operations now providing better failure messages: !=, <=, >=, <, and >.
  • Improvements to -r for xfailures and xpasses:
    • Report tracebacks for xfailures when -rx is set.
    • Report captured output for xpasses when -rX is set.
    • For xpasses, add - in summary between test name and reason, to match how xfail is displayed.
    • This one was important to me. Massively helps when checking/debugging xfail/xpass outcomes in CI. Thanks to Fabian Sturm, Bruno Oliveira, and Ran Benita for help to get this release.
  • Lots of other improvements.
  • See the full changelog (https://docs.pytest.org/en/stable/changelog.html) for all the juicy details. And then upgrade and try it out!
  • pip install -U pytest

Michael #3: Assorted Docker Goodies

  • OrbStack (https://orbstack.dev)
    • Say goodbye to slow, clunky containers and VMs.
    • OrbStack is the fast, light, and easy way to run Docker containers and Linux. Develop at lightspeed with our Docker Desktop alternative.
  • Podman (https://podman.io)
    • Podman is an open source container, pod, and container image management engine. Podman makes it easy to find, run, build, and share containers.
    • Manage containers (not just Podman).
    • Podman Desktop allows you to list, view, and manage containers from multiple supported container engines in a single unified view.
    • Gain easy access to a shell inside the container, logs, and basic controls.
    • Works on Podman, Docker, Lima, kind, Red Hat OpenShift, Red Hat OpenShift Developer Sandbox.
  • CasaOS (https://casaos.io)
    • Your Personal Cloud OS.
    • Community-based open source software focused on delivering a simple personal cloud experience around the Docker ecosystem.
    • Also have the ZimaCube hardware (https://www.kickstarter.com/projects/icewhaletech/zimacube-personal-cloud-re-invented) (Personal cloud. Re-invented.)

Brian #4: New GitHub Copilot Research Finds 'Downward Pressure on Code Quality' (https://visualstudiomagazine.com/articles/2024/01/25/copilot-research.aspx)

  • David Ramel
  • Regarding "…the quality and maintainability of AI-assisted code compared to what would have been written by a human."
  • Q: "Is it more similar to the careful, refined contributions of a Senior Developer, or more akin to the disjointed work of a short-term contractor?"
  • A: "We find disconcerting trends for maintainability. Code churn -- the percentage of lines that are reverted or updated less than two weeks after being authored -- is projected to double in 2024 compared to its 2021, pre-AI baseline. We further find that the percentage of 'added code' and 'copy/pasted code' is increasing in proportion to 'updated,' 'deleted,' and 'moved' code. In this regard, AI-generated code resembles an itinerant contributor, prone to violate the DRY-ness [don't repeat yourself] of the repos visited."

Extras

Brian:

  • Did I mention pytest 8? Just pip install -U pytest today.
  • And if you want to learn pytest super fast, check out The Complete pytest Course (https://courses.pythontest.com/p/the-complete-pytest-course) or grab a copy of the book, Python Testing with pytest (https://pythontest.com/pytest-book/).

Michael:

  • I'd like to encourage people to join our mailing list. We have some fun plans and some of them involve our newsletter (https://pythonbytes.fm/friends-of-the-show). It's super private, no third parties, no spam and is based on my recent Docker and Listmonk (https://github.com/mikeckennedy/listmonk) work.
  • Big release for Pydantic, 2.6 (https://github.com/pydantic/pydantic/releases/tag/v2.6.0).
  • New essay: Use Custom Search Engines Way More (https://mkennedy.codes/posts/you-should-use-custom-search-engines-way-more/)

Joke:

  • Pushing to main (https://devhumor.com/media/better-call-saul)
  • Junior vs Senior engineer (https://packmates.org/@draconigen/111805683031473439?utm_source=pocket_saves)

Erik Marsja: Pandas: Cumulative Sum by Group

Tue, 2024-01-30 02:32


In this post, we learn how to use Pandas to calculate a cumulative sum by group, a sometimes important operation in data analysis. Consider a scenario in cognitive psychology research where researchers often analyze participants’ responses over multiple trials or conditions. Calculating the cumulative sum by group may be important to understand the evolving trends or patterns within specific experimental groups. For instance, tracking the cumulative reaction times or accuracy rates across different experimental conditions can unveil insightful patterns. These patterns, in turn, can shed light on cognitive processes.

Pandas, a widely used data manipulation library in Python, simplifies this process, providing an effective mechanism for computing cumulative sums within specific groups. We will see how this functionality streamlines complex calculations as we get into the examples. Pandas enhance our ability to draw meaningful insights from grouped data in diverse analytical contexts.

Outline

The structure of the current post is as follows. First, we quickly look at what you need to follow the post. Next, we give a brief overview of cumulative sums in Pandas and introduce the cumsum() function. We then create a practice dataset and calculate the cumulative sum on it with Pandas cumsum(), first without grouping, before moving on to more advanced applications with cumulative sums by group, exploring examples that illustrate its versatility and practical use in data analysis. We conclude by summarizing key takeaways.

Prerequisites

Before we explore the cumulative sum by group in Pandas, ensure you have a basic knowledge of Python and Pandas. If the necessary libraries (i.e., Pandas) are not installed, consider adding them to your Python environment to follow along seamlessly. Familiarity with groupby operations in Pandas will be particularly beneficial, since the cumulative sum operation often involves grouping data based on specific criteria.

Understanding Cumulative Sum

Understanding the cumulative sum can be important in data analysis. This is especially true when exploring trends, aggregating data, or tracking accumulative changes over time. The cumulative sum, or cumsum, is a mathematical concept involving progressively adding up a sequence of numbers. In Pandas, this operation is simplified using the cumsum() function.
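For example, here is a quick sketch (mine, not from the post) of a cumulative sum on a plain Series:

    import pandas as pd

    s = pd.Series([1, 2, 3, 4])
    print(s.cumsum().tolist())  # [1, 3, 6, 10]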

Syntax of Pandas cumsum()

The cumsum() function in Pandas has several parameters that enable customization based on specific requirements:

  1. axis: Specifies the axis along which the cumulative sum should be computed. By default, the sum is computed along the index (axis 0), i.e. down each column of a dataframe.
  2. skipna: A Boolean value that determines whether to exclude NaN values during the computation. If set to True (default), NaN values are ignored, while if set to False, they are treated as valid input for the sum.
  3. *args, **kwargs: Additional arguments and keyword arguments that can be passed to customize the function’s behavior further.
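To illustrate skipna, here is a small sketch of my own:

    import pandas as pd
    import numpy as np

    s = pd.Series([1, np.nan, 2])
    print(s.cumsum(skipna=True).tolist())   # [1.0, nan, 3.0]
    print(s.cumsum(skipna=False).tolist())  # [1.0, nan, nan]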

Understanding these parameters is important to customize the cumulative sum operation to our specific needs, providing flexibility in dealing with different data types and scenarios.

Before learning how to do the group-specific cumulative sum, let us explore how to perform a basic cumulative sum without grouping. This foundational knowledge will serve as a stepping stone for our subsequent exploration of the cumulative sum by the group in Pandas. But first, we will create some data to practice.

Synthetic Data

Let us create a small sample dataset using Pandas to practice cumulative sum.

import pandas as pd
import numpy as np

# Create a sample dataframe with a grouping variable
data = {
    'Participant_ID': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'Hearing_Status': ['Normal', 'Normal', 'Normal',
                       'Impaired', 'Impaired', 'Impaired',
                       'Normal', 'Normal', 'Normal'],
    'Task': ['Reading Span', 'Operation Span', 'Digit Span'] * 3,
    'Trial': [1, 2, 3] * 3,
    'WM_Score': [8, 15, 4, 12, np.nan, 7, 9, 10, 8],
    'Speech_Recognition_Score': [75, 82, 68, np.nan, 90, 76, 88, 85, np.nan]
}

df = pd.DataFrame(data)

This dataset simulates cognitive psychology tests where participants undergo different tasks (reading, operation, digit span) over multiple trials, with associated working memory (WM) and speech recognition scores. Some scores intentionally include NaN values to demonstrate handling missing data.

The dataframe structure is organized with columns for ‘Participant_ID’, ‘Task’, ‘Trial’, ‘WM_Score’, and ‘Speech_Recognition_Score’. We also have the grouping variable ‘Hearing_Status’. Each row represents a participant’s performance in a specific task during a particular trial.

This dataset will be the basis for practicing using Pandas to calculate cumulative sum by group. First, however, we will just learn how to use the cumsum() function.

Using Pandas to Calculate Cumulative Sum

Here is an example of using Pandas cumsum() without grouping:

# Calculate cumulative sum without grouping
df['Cumulative_WM_Score'] = df['WM_Score'].cumsum()
df['Cumulative_SPIN_Score'] = df['Speech_Recognition_Score'].cumsum()

In the code chunk above, we used the cumsum() function from Pandas to compute the cumulative sum of the 'WM_Score' and 'Speech_Recognition_Score' columns in the dataframe. The .cumsum() method is applied directly to the selected columns, creating the new columns 'Cumulative_WM_Score' and 'Cumulative_SPIN_Score'. This operation calculates the running total of the scores across all rows in the dataset. In the original post, rows 2 to 7 were then selected with Pandas iloc and the first five rows printed as a screenshot.
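A minimal sketch of the same selection, assuming the df built above:

    # Rows 2 to 7 (iloc's stop index is exclusive, hence 8),
    # then print the first five of those rows
    print(df.iloc[2:8].head())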

Pandas Cumulative Sum by Group: Examples

Example 1: Cumulative Sum by Group with One Column

Let us start by looking at the basic application of cumulative sum within a group for a single column using Pandas. This example will consider the cumulative sum of working memory scores (‘WM_Score’) within the different groups.

df['Cum_WM_Score'] = df.groupby('Hearing_Status')['WM_Score'].cumsum()

In the code chunk above, we are using Pandas to create a new column, ‘Cum_WM_Score,’ in the DataFrame df. This new column will contain the cumulative sum of the ‘WM_Score’ column within each group defined by the ‘Hearing_Status’ column. The groupby() function is employed to group the data by the ‘Hearing_Status’ column, and then cumsum() is applied to calculate the cumulative sum for each group separately. The result is a dataframe with the original columns and the newly added ‘Cum_WM_Score’ column, capturing the cumulative sum of working memory scores within each hearing status group.

Example 2: Cumulative Sum by Group with Multiple Columns

Expanding on the concept, we can compute the cumulative sum for multiple columns within groups:

cols_to_cumsum = ['WM_Score', 'Speech_Recognition_Score']
df[cols_to_cumsum] = df.groupby('Hearing_Status')[cols_to_cumsum].cumsum()

In the code snippet above, we again used Pandas to perform a cumulative sum on selected columns (i.e., ‘WM_Score’ and ‘Speech_Recognition_Score’) within each group. This is an extension of the concept introduced in Example 1, where we applied cumsum() on a single column within groups.

Here, we use the groupby() function to group the data by the ‘Hearing_Status’ column and then apply cumsum() to the specified columns using cols_to_cumsum. The result is an updated dataframe df with cumulative sums calculated for the chosen columns within each hearing status group.

Summary

In this post, we looked at using Pandas to calculate cumulative sums by group, a crucial operation in data analysis. Starting with a foundational understanding of cumulative sums and their relevance, we explored the basic cumsum() function. The introduction of group-specific calculations brought us to Example 1, showcasing how to compute cumulative sums within a group for a single column. Building on this, Example 2 extended the concept to multiple columns, demonstrating the versatility of Pandas’ cumulative sum by group.

We navigated through the syntax and application of the cumsum() function, gaining insights into handling missing values and edge cases. Working with a sample dataset inspired by cognitive psychology, we looked at practical scenarios for cumulative sum by group. The approach used in Examples 1 and 2 provides a foundation for applying custom aggregation functions and tackling diverse challenges within grouped data.

Feel free to share this tutorial on social media, and if you find this post valuable for your reports or papers, include the link for others to benefit!

Resources

The post Pandas: Cumulative Sum by Group appeared first on Erik Marsja.


Anarcat: router archeology: the Soekris net5501

Mon, 2024-01-29 23:20

Roadkiller was a Soekris net5501 router I used as my main gateway between 2010 and 2016 (for network and telephone service).

It was upgraded to FreeBSD 8.4-p12 (2014-06-06) and pkgng. It was retired in favor of octavia around 2016.

Roughly 10 years later (2024-01-24), I found it in a drawer and, to my surprise, it booted. After wrangling with an RS-232 USB adapter, a null modem cable, and bit rates, I even logged in:

comBIOS ver. 1.33  20070103  Copyright (C) 2000-2007 Soekris Engineering.

net5501

0512 Mbyte Memory                        CPU Geode LX 500 Mhz

Pri Mas  WDC WD800VE-00HDT0              LBA Xlt 1024-255-63  78 Gbyte

Slot   Vend Dev  ClassRev Cmd  Stat CL LT HT  Base1    Base2   Int
-------------------------------------------------------------------
0:01:2 1022 2082 10100000 0006 0220 08 00 00 A0000000 00000000 10
0:06:0 1106 3053 02000096 0117 0210 08 40 00 0000E101 A0004000 11
0:07:0 1106 3053 02000096 0117 0210 08 40 00 0000E201 A0004100 05
0:08:0 1106 3053 02000096 0117 0210 08 40 00 0000E301 A0004200 09
0:09:0 1106 3053 02000096 0117 0210 08 40 00 0000E401 A0004300 12
0:20:0 1022 2090 06010003 0009 02A0 08 40 80 00006001 00006101
0:20:2 1022 209A 01018001 0005 02A0 08 00 00 00000000 00000000
0:21:0 1022 2094 0C031002 0006 0230 08 00 80 A0005000 00000000 15
0:21:1 1022 2095 0C032002 0006 0230 08 00 00 A0006000 00000000 15

4 Seconds to automatic boot.   Press Ctrl-P for entering Monitor.

[FreeBSD loader menu; the ASCII-art logo and box-drawing border were mangled over the serial console]

Welcome to FreeBSD!

1. Boot FreeBSD [default]
2. Boot FreeBSD with ACPI enabled
3. Boot FreeBSD in Safe Mode
4. Boot FreeBSD in single user mode
5. Boot FreeBSD with verbose logging
6. Escape to loader prompt
7. Reboot

Select option, [Enter] for default
or [Space] to pause timer  5

Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.4-RELEASE-p12 #5: Fri Jun  6 02:43:23 EDT 2014
    root@roadkiller.anarc.at:/usr/obj/usr/src/sys/ROADKILL i386
gcc version 4.2.2 20070831 prerelease [FreeBSD]
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Geode(TM) Integrated Processor by AMD PCS (499.90-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x5a2  Family = 5  Model = a  Stepping = 2
  Features=0x88a93d<FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CLFLUSH,MMX>
  AMD Features=0xc0400000<MMX+,3DNow!+,3DNow!>
real memory  = 536870912 (512 MB)
avail memory = 506445824 (482 MB)
kbd1 at kbdmux0
K6-family MTRR support enabled (2 registers)
ACPI Error: A valid RSDP was not found (20101013/tbxfroot-309)
ACPI: Table initialisation failed: AE_NOT_FOUND
ACPI: Try disabling either ACPI or apic support.
cryptosoft0: <software crypto> on motherboard
pcib0 pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
Geode LX: Soekris net5501 comBIOS ver. 1.33 20070103 Copyright (C) 2000-2007
pci0: <encrypt/decrypt, entertainment crypto> at device 1.2 (no driver attached)
vr0: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe100-0xe1ff mem 0xa0004000-0xa00040ff irq 11 at device 6.0 on pci0
vr0: Quirks: 0x2
vr0: Revision: 0x96
miibus0: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> PHY 1 on miibus0
ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr0: Ethernet address: 00:00:24:cc:93:44
vr0: [ITHREAD]
vr1: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe200-0xe2ff mem 0xa0004100-0xa00041ff irq 5 at device 7.0 on pci0
vr1: Quirks: 0x2
vr1: Revision: 0x96
miibus1: <MII bus> on vr1
ukphy1: <Generic IEEE 802.3u media interface> PHY 1 on miibus1
ukphy1:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr1: Ethernet address: 00:00:24:cc:93:45
vr1: [ITHREAD]
vr2: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe300-0xe3ff mem 0xa0004200-0xa00042ff irq 9 at device 8.0 on pci0
vr2: Quirks: 0x2
vr2: Revision: 0x96
miibus2: <MII bus> on vr2
ukphy2: <Generic IEEE 802.3u media interface> PHY 1 on miibus2
ukphy2:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr2: Ethernet address: 00:00:24:cc:93:46
vr2: [ITHREAD]
vr3: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe400-0xe4ff mem 0xa0004300-0xa00043ff irq 12 at device 9.0 on pci0
vr3: Quirks: 0x2
vr3: Revision: 0x96
miibus3: <MII bus> on vr3
ukphy3: <Generic IEEE 802.3u media interface> PHY 1 on miibus3
ukphy3:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
vr3: Ethernet address: 00:00:24:cc:93:47
vr3: [ITHREAD]
isab0: <PCI-ISA bridge> at device 20.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <AMD CS5536 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 20.2 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata0: [ITHREAD]
ata1: <ATA channel> at channel 1 on atapci0
ata1: [ITHREAD]
ohci0: <OHCI (generic) USB controller> mem 0xa0005000-0xa0005fff irq 15 at device 21.0 on pci0
ohci0: [ITHREAD]
usbus0 on ohci0
ehci0: <AMD CS5536 (Geode) USB 2.0 controller> mem 0xa0006000-0xa0006fff irq 15 at device 21.1 on pci0
ehci0: [ITHREAD]
usbus1: EHCI version 1.0
usbus1 on ehci0
cpu0 on motherboard
pmtimer0 on isa0
orm0: <ISA Option ROM> at iomem 0xc8000-0xd27ff pnpid ORM0000 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
atrtc0: <AT Real Time Clock> at port 0x70 irq 8 on isa0
ppc0: parallel port not found.
uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
uart0: [FILTER]
uart0: console (19200,n,8,1)
uart1: <16550 or compatible> at port 0x2f8-0x2ff irq 3 on isa0
uart1: [FILTER]
Timecounter "TSC" frequency 499903982 Hz quality 800
Timecounters tick every 1.000 msec
IPsec: Initialized Security Association Processing.
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 480Mbps High Speed USB v2.0
ad0: 76319MB <WDC WD800VE-00HDT0 09.07D09> at ata0-master UDMA100
ugen0.1: <AMD> at usbus0
uhub0: <AMD OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <AMD> at usbus1
uhub1: <AMD EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
GEOM: ad0s1: geometry does not match label (255h,63s != 16h,63s).
uhub0: 4 ports with 4 removable, self powered
Root mount waiting for: usbus1
Root mount waiting for: usbus1
uhub1: 4 ports with 4 removable, self powered
Trying to mount root from ufs:/dev/ad0s1a

The last log rotation is from 2016:

[root@roadkiller /var/log]# stat /var/log/wtmp
65 61783 -rw-r--r-- 1 root wheel 208219 1056 "Nov  1 05:00:01 2016" "Jan 18 22:29:16 2017" "Jan 18 22:29:16 2017" "Nov  1 05:00:01 2016" 16384 4 0 /var/log/wtmp

Interestingly, I switched between eicat and teksavvy on December 11th. Which year? Who knows!

Dec 11 16:38:40 roadkiller mpd: [eicatL0] LCP: authorization successful
Dec 11 16:41:15 roadkiller mpd: [teksavvyL0] LCP: authorization successful

Never realized those good old logs had an "oh dear, forgot the year" issue (that's something like Y2K except just "Y", I guess).

That was probably 2015, because the log dates from 2017, and the last entry is from November of the year after the above:

[root@roadkiller /var/log]# stat mpd.log
65 47113 -rw-r--r-- 1 root wheel 193008 71939195 "Jan 18 22:39:18 2017" "Jan 18 22:39:59 2017" "Jan 18 22:39:59 2017" "Apr  2 10:41:37 2013" 16384 140640 0 mpd.log

It looks like the system was installed in 2010:

[root@roadkiller /var/log]# stat /
63 2 drwxr-xr-x 21 root wheel 2120 512 "Jan 18 22:34:43 2017" "Jan 18 22:28:12 2017" "Jan 18 22:28:12 2017" "Jul 18 22:25:00 2010" 16384 4 0 /

... so it lived for about 6 years, but still works after almost 14 years, which I find utterly amazing.

Another amazing thing is that there's tuptime installed on that server! That is software I thought I had discovered later and then sponsored in Debian, but it turns out I was already using it then!

[root@roadkiller /var]# tuptime
System startups:    19  since  21:20:16 11/07/15
System shutdowns:   0 ok  -  18 bad
System uptime:      85.93 %  -  1 year, 11 days, 10 hours, 3 minutes and 36 seconds
System downtime:    14.07 %  -  61 days, 15 hours, 22 minutes and 45 seconds
System life:        1 year, 73 days, 1 hour, 26 minutes and 20 seconds

Largest uptime:     122 days, 9 hours, 17 minutes and 6 seconds  from  08:17:56 02/02/16
Shortest uptime:    5 minutes and 4 seconds  from  21:55:00 01/18/17
Average uptime:     19 days, 19 hours, 28 minutes and 37 seconds

Largest downtime:   57 days, 1 hour, 9 minutes and 59 seconds  from  20:45:01 11/22/16
Shortest downtime:  -1 years, 364 days, 23 hours, 58 minutes and 12 seconds  from  22:30:01 01/18/17
Average downtime:   3 days, 5 hours, 51 minutes and 43 seconds

Current uptime:     18 minutes and 23 seconds  since  22:28:13 01/18/17

Actual up/down times:

[root@roadkiller /var]# tuptime -t
No.  Startup Date       Uptime                                         Shutdown Date      End  Downtime

1    21:20:16 11/07/15  1 day, 0 hours, 40 minutes and 12 seconds      22:00:28 11/08/15  BAD  2 minutes and 37 seconds
2    22:03:05 11/08/15  1 day, 9 hours, 41 minutes and 57 seconds      07:45:02 11/10/15  BAD  3 minutes and 24 seconds
3    07:48:26 11/10/15  20 days, 2 hours, 41 minutes and 34 seconds    10:30:00 11/30/15  BAD  4 hours, 50 minutes and 21 seconds
4    15:20:21 11/30/15  19 minutes and 40 seconds                      15:40:01 11/30/15  BAD  6 minutes and 5 seconds
5    15:46:06 11/30/15  53 minutes and 55 seconds                      16:40:01 11/30/15  BAD  1 hour, 1 minute and 38 seconds
6    17:41:39 11/30/15  6 days, 16 hours, 3 minutes and 22 seconds     09:45:01 12/07/15  BAD  4 days, 6 hours, 53 minutes and 11 seconds
7    16:38:12 12/11/15  50 days, 17 hours, 56 minutes and 49 seconds   10:35:01 01/31/16  BAD  10 minutes and 52 seconds
8    10:45:53 01/31/16  1 day, 21 hours, 28 minutes and 16 seconds     08:14:09 02/02/16  BAD  3 minutes and 48 seconds
9    08:17:56 02/02/16  122 days, 9 hours, 17 minutes and 6 seconds    18:35:02 06/03/16  BAD  10 minutes and 16 seconds
10   18:45:18 06/03/16  29 days, 17 hours, 14 minutes and 43 seconds   12:00:01 07/03/16  BAD  12 minutes and 34 seconds
11   12:12:35 07/03/16  31 days, 17 hours, 17 minutes and 26 seconds   05:30:01 08/04/16  BAD  14 minutes and 25 seconds
12   05:44:26 08/04/16  15 days, 1 hour, 55 minutes and 35 seconds     07:40:01 08/19/16  BAD  6 minutes and 51 seconds
13   07:46:52 08/19/16  7 days, 5 hours, 23 minutes and 10 seconds     13:10:02 08/26/16  BAD  3 minutes and 45 seconds
14   13:13:47 08/26/16  27 days, 21 hours, 36 minutes and 14 seconds   10:50:01 09/23/16  BAD  2 minutes and 14 seconds
15   10:52:15 09/23/16  60 days, 10 hours, 52 minutes and 46 seconds   20:45:01 11/22/16  BAD  57 days, 1 hour, 9 minutes and 59 seconds
16   21:55:00 01/18/17  5 minutes and 4 seconds                        22:00:04 01/18/17  BAD  11 minutes and 15 seconds
17   22:11:19 01/18/17  8 minutes and 42 seconds                       22:20:01 01/18/17  BAD  1 minute and 20 seconds
18   22:21:21 01/18/17  8 minutes and 40 seconds                       22:30:01 01/18/17  BAD  -1 years, 364 days, 23 hours, 58 minutes and 12 seconds
19   22:28:13 01/18/17  20 minutes and 17 seconds

The last few entries are actually the tests I'm running now; it seems this machine thinks it's now 2017-01-18 at ~22:00, while it's actually 2024-01-24 at ~12:00 local:

Wed Jan 18 23:05:38 EST 2017

FreeBSD/i386 (roadkiller.anarc.at) (ttyu0)

login: root
Password:
Jan 18 23:07:10 roadkiller login: ROOT LOGIN (root) ON ttyu0
Last login: Wed Jan 18 22:29:16 on ttyu0
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.

FreeBSD 8.4-RELEASE-p12 (ROADKILL) #5: Fri Jun  6 02:43:23 EDT 2014

Reminders:
 * commit stuff in /etc
 * reload firewall (in screen!):
     pfctl -f /etc/pf.conf ; sleep 1
 * vim + syn on makes pf.conf more readable
 * monitoring the PPPoE uplink:
     tail -f /var/log/mpd.log

Current problems:

 * sometimes pf doesn't start properly on boot, if pppoe failed to come up,
   use this to resume:

     /etc/rc.d/pf start

   it will kill your shell, but fix NAT (2012-08-10)
 * babel fails to start on boot (2013-06-15):

     babeld -D -g 33123 tap0 vr3

 * DNS often fails, tried messing with unbound.conf (2014-10-05) and updating
   named.root (2016-01-28) and performance tweaks (ee63689)
 * asterisk and mpd4 are deprecated and should be uninstalled when we're sure
   their replacements (voipms + ata and mpd5) are working (2015-01-13)
 * if IPv6 fails, it's because netblocks are not being routed upstream.
   DHCPcd should do this, but doesn't start properly, use this to resume
   (2015-12-21):

     /usr/local/sbin/dhcpcd -6 --persistent --background --timeout 0 -C resolv.conf ng0

This machine is doomed to be replaced with the new omnia router, Indiegogo
campaign should ship in april 2016:

  http://igg.me/at/turris-omnia/x

(I really like the motd I left myself there. In theory, I guess this could just start connecting to the internet again if I still had the same PPPoE/ADSL link I had almost a decade ago; obviously, I do not.)

Not sure how the system figured the 2017 time: the onboard clock itself believes we're in 1980, so clearly the CMOS battery has (understandably) failed:

> ?

comBIOS Monitor Commands

boot [drive][:partition]    INT19 Boot
reboot                      cold boot
download                    download a file using XMODEM/CRC
flashupdate                 update flash BIOS with downloaded file
time [HH:MM:SS]             show or set time
date [YYYY/MM/DD]           show or set date
d[b|w|d] [adr]              dump memory bytes/words/dwords
e[b|w|d] adr value [...]    enter bytes/words/dwords
i[b|w|d] port               input from 8/16/32-bit port
o[b|w|d] port value         output to 8/16/32-bit port
run adr                     execute code at adr
cmosread [adr]              read CMOS RAM data
cmoswrite adr byte [...]    write CMOS RAM data
cmoschecksum                update CMOS RAM Checksum
set parameter=value         set system parameter to value
show [parameter]            show one or all system parameters
?/help                      show this help

> show

ConSpeed = 19200
ConLock = Enabled
ConMute = Disabled
BIOSentry = Enabled
PCIROMS = Enabled
PXEBoot = Enabled
FLASH = Primary
BootDelay = 5
FastBoot = Disabled
BootPartition = Disabled
BootDrive = 80 81 F0 FF
ShowPCI = Enabled
Reset = Hard
CpuSpeed = Default

> time
Current Date and Time is: 1980/01/01 00:56:47

Another bit of archeology: I had documented various outages with my ISP... back in 2003!

[root@roadkiller ~/bin]# cat ppp_stats/downtimes.txt
11/03/2003 18:24:49 218
12/03/2003 09:10:49 118
12/03/2003 10:05:57 680
12/03/2003 10:14:50 106
12/03/2003 10:16:53 6
12/03/2003 10:35:28 146
12/03/2003 10:57:26 393
12/03/2003 11:16:35 5
12/03/2003 11:16:54 11
13/03/2003 06:15:57 18928
13/03/2003 09:43:36 9730
13/03/2003 10:47:10 23
13/03/2003 10:58:35 5
16/03/2003 01:32:36 338
16/03/2003 02:00:33 120
16/03/2003 11:14:31 14007
19/03/2003 00:56:27 11179
19/03/2003 00:56:43 5
19/03/2003 00:56:53 0
19/03/2003 00:56:55 1
19/03/2003 00:57:09 1
19/03/2003 00:57:10 1
19/03/2003 00:57:24 1
19/03/2003 00:57:25 1
19/03/2003 00:57:39 1
19/03/2003 00:57:40 1
19/03/2003 00:57:44 3
19/03/2003 00:57:53 0
19/03/2003 00:57:55 0
19/03/2003 00:58:08 0
19/03/2003 00:58:10 0
19/03/2003 00:58:23 0
19/03/2003 00:58:25 0
19/03/2003 00:58:39 1
19/03/2003 00:58:42 2
19/03/2003 00:58:58 5
19/03/2003 00:59:35 2
19/03/2003 00:59:47 3
19/03/2003 01:00:34 3
19/03/2003 01:00:39 0
19/03/2003 01:00:54 0
19/03/2003 01:01:11 2
19/03/2003 01:01:25 1
19/03/2003 01:01:48 1
19/03/2003 01:02:03 1
19/03/2003 01:02:10 2
19/03/2003 01:02:20 3
19/03/2003 01:02:44 3
19/03/2003 01:03:45 3
19/03/2003 01:04:39 2
19/03/2003 01:05:40 2
19/03/2003 01:06:35 2
19/03/2003 01:07:36 2
19/03/2003 01:08:31 2
19/03/2003 01:08:38 2
19/03/2003 01:10:07 3
19/03/2003 01:11:05 2
19/03/2003 01:12:03 3
19/03/2003 01:13:01 3
19/03/2003 01:13:58 2
19/03/2003 01:14:59 5
19/03/2003 01:15:54 2
19/03/2003 01:16:55 2
19/03/2003 01:17:50 2
19/03/2003 01:18:51 3
19/03/2003 01:19:46 2
19/03/2003 01:20:46 2
19/03/2003 01:21:42 3
19/03/2003 01:22:42 3
19/03/2003 01:23:37 2
19/03/2003 01:24:38 3
19/03/2003 01:25:33 2
19/03/2003 01:26:33 2
19/03/2003 01:27:30 3
19/03/2003 01:28:55 2
19/03/2003 01:29:56 2
19/03/2003 01:30:50 2
19/03/2003 01:31:42 3
19/03/2003 01:32:36 3
19/03/2003 01:33:27 2
19/03/2003 01:34:21 2
19/03/2003 01:35:22 2
19/03/2003 01:36:17 3
19/03/2003 01:37:18 2
19/03/2003 01:38:13 3
19/03/2003 01:39:39 2
19/03/2003 01:40:39 2
19/03/2003 01:41:35 3
19/03/2003 01:42:35 3
19/03/2003 01:43:31 3
19/03/2003 01:44:31 3
19/03/2003 01:45:53 3
19/03/2003 01:46:48 3
19/03/2003 01:47:48 2
19/03/2003 01:48:44 3
19/03/2003 01:49:44 2
19/03/2003 01:50:40 3
19/03/2003 01:51:39 1
19/03/2003 11:04:33 19
19/03/2003 18:39:36 2833
19/03/2003 18:54:05 825
19/03/2003 19:04:00 454
19/03/2003 19:08:11 210
19/03/2003 19:41:44 272
19/03/2003 21:18:41 208
24/03/2003 04:51:16 6
27/03/2003 04:51:20 5
30/03/2003 04:51:25 5
31/03/2003 08:30:31 255
03/04/2003 08:30:36 5
06/04/2003 01:16:00 621
06/04/2003 22:18:08 17
06/04/2003 22:32:44 13
09/04/2003 22:33:12 28
12/04/2003 22:33:17 6
15/04/2003 22:33:22 5
17/04/2003 15:03:43 18
20/04/2003 15:03:48 5
23/04/2003 15:04:04 16
23/04/2003 21:08:30 339
23/04/2003 21:18:08 13
23/04/2003 23:34:20 253
26/04/2003 23:34:45 25
29/04/2003 23:34:49 5
02/05/2003 13:10:01 185
05/05/2003 13:10:06 5
08/05/2003 13:10:11 5
09/05/2003 14:00:36 63928
09/05/2003 16:58:52 2
11/05/2003 23:08:48 2
14/05/2003 23:08:53 6
17/05/2003 23:08:58 5
20/05/2003 23:09:03 5
23/05/2003 23:09:08 5
26/05/2003 23:09:14 5
29/05/2003 23:00:10 3
29/05/2003 23:03:01 10
01/06/2003 23:03:05 4
04/06/2003 23:03:10 5
07/06/2003 23:03:38 28
10/06/2003 23:03:50 12
13/06/2003 23:03:55 6
14/06/2003 07:42:20 3
14/06/2003 14:37:08 3
15/06/2003 20:08:34 3
18/06/2003 20:08:39 6
21/06/2003 20:08:45 6
22/06/2003 03:05:19 138
22/06/2003 04:06:28 3
25/06/2003 04:06:58 31
28/06/2003 04:07:02 4
01/07/2003 04:07:06 4
04/07/2003 04:07:11 5
07/07/2003 04:07:16 5
12/07/2003 04:55:20 6
12/07/2003 19:09:51 1158
12/07/2003 22:14:49 8025
15/07/2003 22:14:54 6
16/07/2003 05:43:06 18
19/07/2003 05:43:12 6
22/07/2003 05:43:17 5
23/07/2003 18:18:55 183
23/07/2003 18:19:55 9
23/07/2003 18:29:15 158
23/07/2003 19:48:44 4604
23/07/2003 20:16:27 3
23/07/2003 20:37:29 1079
23/07/2003 20:43:12 342
23/07/2003 22:25:51 6158

Fascinating.

I suspect the (IDE!) hard drive might be failing as I saw two new files created in /var that I didn't remember seeing before:

-rw-r--r--  1 root  wheel  0 Jan 18 22:55 3@T3
-rw-r--r--  1 root  wheel  0 Jan 18 22:55 DY5

So I shut down the machine, possibly for the last time:

Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...3 3 0 1 1 0 0 done
All buffers synced.
Uptime: 36m43s
usbus0: Controller shutdown
uhub0: at usbus0, port 1, addr 1 (disconnected)
usbus0: Controller shutdown complete
usbus1: Controller shutdown
uhub1: at usbus1, port 1, addr 1 (disconnected)
usbus1: Controller shutdown complete
The operating system has halted.
Please press any key to reboot.

I'll finally note this was the last FreeBSD server I personally operated. I also used FreeBSD to set up the core routers at Koumbit, but those were replaced with Debian recently as well.

Thanks Soekris, that was some sturdy hardware. Hopefully this new Protectli router will live up to that "decade plus" challenge.

Not sure what the fate of this device will be: I'll bring it to the next Montreal Debian & Stuff to see if anyone's interested; contact me if you can't show up but want this thing.

Categories: FLOSS Project Planets

Python⇒Speed: Profiling your Numba code

Mon, 2024-01-29 19:00

If you’re writing numeric Python code, Numba can be a great way to speed up your program. By compiling a subset of Python to machine code, Numba lets you write for loops and other constructs that would be too slow in normal Python. In other words, it’s similar to Cython, C, or Rust, in that it lets you write compiled extensions for Python.
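
For instance (a minimal sketch, not from the post itself), decorating a plain Python loop with Numba's @njit compiles it to machine code on the first call:

from numba import njit

@njit
def sum_of_squares(n):
    # An explicit loop like this would be slow in pure Python,
    # but Numba compiles it to fast machine code.
    total = 0
    for i in range(n):
        total += i * i
    return total

print(sum_of_squares(10_000_000))  # the first call triggers compilation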

Numba code isn’t always as fast as it could be, however. This is where profiling is useful: it can find at least some of the bottlenecks in your code.

In this article we’ll cover:

  • Profila, a new profiler I’ve released that is specifically designed for Numba code.
  • The limits of profiling. There are many potential performance enhancements that a profiler can’t and won’t help you discover.
Read more...
Categories: FLOSS Project Planets

Kay Hayen: Nuitka this week #15

Mon, 2024-01-29 18:00

This is a weekly update, or at least it's supposed to be weekly, on what's going on in Nuitka land, for you to learn about ongoing developments and important changes to the project.

In this issue, I am first going to cover a bit of backlog from news updates missed in the past, and then some very exciting changes from this week.

Contents

Nuitka Action

A GitHub Action is a component used in GitHub workflows. These are YAML-driven configurations that let GitHub build your software automatically.

Many of the more professional users build their binaries as part of GitHub workflows, and Nuitka and Nuitka commercial are both used that way a lot. Often people do it on their own, i.e. they install Nuitka by hand and call it by hand, which is not the preferred way for many people.

Enter the great Nuitka Action, which was originally created by Jim Kring, who handed over its maintenance to the Nuitka organization, which has further refined it. This was a great contribution that makes everything easier for Nuitka users on GitHub who want to use it.

- name: Build Executable
  uses: Nuitka/Nuitka-Action@main
  with:
    nuitka-version: main
    script-name: kasa_cli
    onefile: true

- name: Upload Artifacts
  uses: actions/upload-artifact@v3
  with:
    name: ${{ runner.os }} Build
    path: |
      build/*.exe
      build/*.bin
      build/*.app/**/*

Options of Nuitka are exposed as YAML attributes. The documentation of this mapping could be very much enhanced, but basically it's just dropping the -- part from e.g. --onefile, and for toggles, you say true.

Now, one interesting limitation of GitHub Actions that I came across this week is that it's not easily possible to specify an option twice. For some values in Nuitka, that is however necessary. Where module names are acceptable, a , separation is supported, but with file paths, we don't do that, e.g. not with --include-data-dir (note that --include-package-data is much better to use), but that is the one it came up for.

But now we support splitting by newline from GitHub Actions for everything that produces a list value as a Nuitka option. See below for a very nice example of how the | in YAML makes that easy to read.

- name: Build Executable
  uses: Nuitka/Nuitka-Action@main
  with:
    nuitka-version: main
    script-name: kasa_cli
    onefile: true
    include-data-dir: |
      source_path_dir1=dest_path_dir1
      source_path_dir2=dest_path_dir2
      source_path_dir3=dest_path_dir3

Note

This works with Nuitka 2.0 or higher.

The Nuitka-Action is continuously refined. Just today we updated its caching action to the latest version, and there is an ongoing effort to improve options. We started to generate options from Nuitka's help output directly, so that it is easier to add support for new options in Nuitka, and to generally make them more consistent.

Community

On the Discord server, you can get in touch with an ever more vibrant community of Nuitka users. You are welcome to join us on the Discord server for Nuitka community where you can hang out with the developers and ask questions. It’s not intended as an interactive manual. You are supposed to read the docs for yourself first. And issues are best reported to GitHub.

I am also now occasionally on the Python Discord server, mostly when I get summoned to answer questions that my community thinks make sense. I have been awarded the community role there, which is pretty nice. I seem to make new connections there.

Optimization Work

For me this is extremely exciting; this has been on my nerves for a long time, and I didn't have the time to figure it out. Now, for the scalability work, I wanted to make sure the algorithm used for loop type analysis is actually going to be sustainable, before I optimize the implementation to scale better.

And lo and behold, one of my oldest code examples, the one I use to demonstrate C-type performance from Python code, had failed to get proper results for a long time. But this changed this week, and it's part of the 2.0 release, making it in my mind worth the version bump by itself. Check out this annotated code.

# Initially the value of undefined "i" is "NUITKA_NINT_UNASSIGNED"
# in its indicator part. The C compiler will remove that assignment
# as it's only checked in the assignment coming up.
i = 0

# Assignment from a constant, produces a value where both the C
# and the object value are valid. This is indicated by a value
# of "NUITKA_NINT_BOTH_VALID". The code generation will assign
# both the object member from a prepared value, and the clong
# member to 0.
#
# For the conditional check, "NUITKA_NINT_CLONG_VALID" will
# always be set, and therefore the function will resort to comparing
# that clong member against 9 simply, which will always be very
# fast. Depending on how well the C compiler can tell if an overflow
# can even occur, such that an object might get created, it can even
# optimize that statically. In this case it probably could, but we
# do not rely on that to be fast.
while i < 9:  # RICH_COMPARE_LT_CBOOL_NINT_CLONG
    # Here, we might change the type of the object. In Python2,
    # this can change from ``int`` to ``long``, and our type
    # analysis tells us that. We can consider another thing,
    # not "NINT", but "NINTLONG" or so, to special case that
    # code. We ignore Python2 here, but multiple possible types
    # will be an issue, e.g. list or tuple, float or complex.
    #
    # So this calls a function, that returns a value of type
    # "NINT" (actually it will become an in-place operation,
    # but let's ignore that too).
    #
    # That function is "BINARY_OPERATION_ADD_NINT_NINT_CLONG"(i, 1)
    # and it is going to check if the CLONG is valid, add the one,
    # and set the result to a new int. It will reset the
    # "NUITKA_NINT_OBJECT_VALID" flag, since the object will not be
    # bothered to create.
    i = i + 1

# Since "NUITKA_INT_OBJECT_VALID" is not given, we need to create the
# PyObject and return it.
return i

Now that the loop analysis works, I will be much happier making the value trace collection faster. I will describe it when I do it. From here on for optimization, the C type NINT needs to be created and code generation for the branching helper functions added, and then we should see this perform perfectly.

Functions like RICH_COMPARE_LT_CBOOL_NINT_CLONG will look like this. We do not yet have RICH_COMPARE_LT_CBOOL_LONG_CLONG, which it will fall back to, but we did RICH_COMPARE_LT_CBOOL_INT_CLONG for Python2 a while ago, and we could expand that, no problem.

extern bool RICH_COMPARE_LT_CBOOL_NINT_CLONG(nuitka_long *operand1, long operand2) {
    if (operand1->validity & NUITKA_LONG_VALUE_VALID) {
        return operand1->long_value < operand2;
    } else {
        return RICH_COMPARE_LT_CBOOL_LONG_CLONG(operand1->long_object, operand2);
    }
}

Once I get to that, performance will become a hot topic. From there, adding sources of type information, be it profile-guided compilation, be it type annotations, be it ever better compile-time type inference, will start to make a lot more sense.

Nuitka 2.0

The 2.0 release has been made. I am going to announce it separately. I usually wait a few days to let potential regressions settle. This time, older C compiler support needed a fixup; there is always something. And then I announce it when I feel that the regressions are gone and that new users will not encounter obvious breakage at all.

Technical Writer

When I first launched Nuitka commercial, I needed to get myself financially supported, dropping my day job after 20 years. I am willing to say that has happened.

Now, as you all know, Nuitka is technically excellent. I cannot say the same thing of the documentation. Large parts of Nuitka commercial are still not even publicly described. The User Manual of Nuitka is good, but not nearly as good as it should be. The website is kind of unorganized. It's pretty obvious that my talent is not there. I have received plenty of help over the years, but it's not been enough to match the outstanding quality of Nuitka and Nuitka commercial.

So, what I did this week, after seeing that my financial projection for the coming years seems to allow it, is to attempt to hire people on a freelance basis. The first step is a technical writer. She knows very little of Python, or even the terminal, but she will know how to organize and improve the content of Nuitka.

It will take time for her to get there and this is very fresh.

Nuitka Website as a DevContainer

As a first necessary step to make it easier to contribute to the Nuitka documentation, the website repo has gained a DevContainer configuration. It will install a small Ubuntu via docker (or podman, if you configured Visual Studio Code that way), run the pipenv environment, and start a daemon to serve the website.

The docs for that are spotty right now, and the technical writer, who is using it, is tasked with improving them.

It should become really easy that way to contribute enhancements to the documentation.

I have yet to figure out how to handle release-matching documentation vs. website documentation for the User Manual. But the idea is certainly that the Nuitka documentation is edited on the website.

Nuitka Website Refinements

With the DevContainer, the need for translation and staging sites is gone. The Nuitka/Website team has been disbanded, since it was only used to control access to “live” rendered branches of the website, which are no more.

As part of the DevContainer process, the website build was changed to Python 3.10, so that the Ubuntu image is easier to use (it was Python 3.9 on Debian so far). The tools used all got upgraded, and many small improvements came out of it. Links got checked after the upgrade, finding a few broken ones, and the translation dropdown is now only present when there are actual translations. Previously, e.g., all posts had one, which made no sense.

Making the container smoother to use will be an ongoing process. How to integrate Nuitka auto-format in an easy fashion is still being looked at.

Nuitka Package Configuration

So I added a post that explains variables, but I still need to do the one for parameters, and also update the actual reference documentation.

Teasers

Future TWNs still have a lot to cover: we will speak about Nuitka-Python (our own Python fork with incredible capabilities), about Nuitka-Watch (our way of making sure Nuitka works with PyPI packages and that hot-fixes do not regress), about compilation reports as a new feature, Windows AV stuff, onefile improvements, and so on. I have interesting material for many weeks. I am limiting myself for now, or I will never publish this.

Twitter and Mastodon

I should be more active there, although I often fail to, due to not wanting to talk about unfinished things; so in practice I do not post there much.

And let's not forget: having followers makes me happy. So do re-tweets. Especially those, please do them.

Help Wanted

Nuitka definitely needs more people to work on it. I hope the technical writer will aid us in better laying out ways for you to help.

If you are interested, I am tagging issues help wanted, and there is a bunch, very likely including at least one you can help with.

Categories: FLOSS Project Planets

Real Python: Python Exceptions: An Introduction

Mon, 2024-01-29 09:00

A Python program terminates as soon as it encounters an error. In Python, an error can be a syntax error or an exception. In this tutorial, you’ll see what an exception is and how it differs from a syntax error. After that, you’ll learn about raising exceptions and making assertions. Then, you’ll get to know all the exception-related keywords that you can use in a try … except block to fine-tune how you can work with Python exceptions.

In this tutorial, you’ll learn how to:

  • Raise an exception in Python with raise
  • Debug and test your code with assert
  • Handle exceptions with try and except
  • Fine-tune your exception handling with else and finally

You’ll get to know these keywords by walking through a practical example of handling a platform-related exception. Finally, you’ll also learn how to create your own custom Python exceptions.
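
As a quick preview of the pattern (a minimal sketch, not the article's platform-related example):

try:
    result = 1 / 0
except ZeroDivisionError:
    print("You can't divide by zero!")
else:
    print(f"Division succeeded: {result}")
finally:
    print("This cleanup code always runs.")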

Get Your Code: Click here to download the free sample code that shows you how exceptions work in Python.

Take the Quiz: Test your knowledge with our interactive “Python Exceptions: An Introduction” quiz. Upon completion you will receive a score so you can track your learning progress over time:

Take the Quiz »

Understanding Exceptions and Syntax Errors

Syntax errors occur when the parser detects an incorrect statement. Observe the following example:

>>> print(0 / 0))
  File "<stdin>", line 1
    print(0 / 0))
                ^
SyntaxError: unmatched ')'

The arrow indicates where the parser ran into the syntax error. Additionally, the error message gives you a hint about what went wrong. In this example, there was one bracket too many. Remove it and run your code again:

>>> print(0 / 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero

This time, you ran into an exception error. This type of error occurs whenever syntactically correct Python code results in an error. The last line of the message indicates what type of exception error you ran into.

Instead of just writing exception error, Python details what type of exception error it encountered. In this case, it was a ZeroDivisionError. Python comes with various built-in exceptions as well as the possibility to create user-defined exceptions.

Raising an Exception in Python

There are scenarios where you might want to stop your program by raising an exception if a condition occurs. You can do this with the raise keyword:
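
For instance (a minimal illustration, not necessarily the article's own snippet):

>>> raise Exception("Something went wrong")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception: Something went wrong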

You can even complement the statement with a custom message. Assume that you’re writing a tiny toy program that expects only numbers up to 5. You can raise an error when an unwanted condition occurs:

# low.py

number = 10
if number > 5:
    raise Exception(f"The number should not exceed 5. ({number=})")
print(number)

In this example, you raised an Exception object and passed it an informative custom message. You built the message using an f-string and a self-documenting expression.

When you run low.py, you’ll get the following output:

Traceback (most recent call last):
  File "./low.py", line 3, in <module>
    raise Exception(f"The number should not exceed 5. ({number=})")
Exception: The number should not exceed 5. (number=10)

The program comes to a halt and displays the exception to your terminal or REPL, offering you helpful clues about what went wrong. Note that the final call to print() never executed, because Python raised the exception before it got to that line of code.

With the raise keyword, you can raise any exception object in Python and stop your program when an unwanted condition occurs.

Debugging During Development With assert

Before moving on to the most common way of working with exceptions in Python using the try … except block, you’ll take a quick look at an exception that’s a bit different than the others.
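
In short, assert checks a condition and raises an AssertionError, with an optional message, whenever that condition is false. A minimal illustration (not the article's own example):

number = 10
assert number <= 5, f"The number should not exceed 5. ({number=})"
# AssertionError: The number should not exceed 5. (number=10)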

Read the full article at https://realpython.com/python-exceptions/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

PyCharm: PyCharm 2024.1 EAP 2 Is Out!

Mon, 2024-01-29 08:58

The second EAP build of PyCharm 2024.1 has landed and is now available for you to explore.  

You can download this new build from our website, through the free Toolbox App, or via snaps for Ubuntu.

Download PyCharm 2024.1 EAP

This new build introduces a reworked Terminal tool window and brings the ability to run or debug both the client and the server in a single npm configuration. Take a look!

Revamped Terminal tool window

PyCharm 2024.1 EAP 2 brings an overhauled terminal with both visual and functional enhancements to make terminal-based tasks simpler and more convenient. This update both improves the tool visually and expands its feature set.

The new Terminal tool window seamlessly integrates with the new UI, aligning it with the IDE’s refreshed look-and-feel, and it comes complete with a new color scheme that enhances readability.

One standout improvement is the presentation of each command in a distinct block. This makes it easy to identify the start and end of each one, enhancing the overall clarity of output. Easily navigate between blocks using the arrow keys or switch the focus between the prompt and output with the ⌘↑ / Ctrl + ↑ and  ⌘↓ / Ctrl + ↓ keyboard shortcuts.

We introduced a convenient command history with filtering options, making navigation through recently executed commands a breeze.

The updated terminal supports only Bash, Zsh, and PowerShell (currently only for Windows 11). We are actively working on supporting more shell integrations.

Run or debug both client and server in a single npm configuration

You can now use the same npm configuration to run and debug both the server and client sides of your application. The new Browser / Live Edit tab in the Run/Debug Configurations editor allows you to set up a run configuration to open a browser automatically after launch. If needed, you can attach the debugger to the opened browser right away.

These are the most notable updates for this week. For the full list of the implemented changes, please consult the release notes. 

Try out these new features and let us know what you think in the comments below or on X (formerly Twitter). If you encounter any bugs, please report them via our issue tracker.

Happy developing!

Categories: FLOSS Project Planets

Brian Okken: pytest 8 is here

Mon, 2024-01-29 07:00
pytest 8.0.0 was released on 17-Jan-2024, and I'm pretty excited about it. I'm not going to cover all of the changes, I'll just highlight a few. For the full set of changes, see the pytest changelog: Changes in 8.0.0rc1, Changes in 8.0.0rc2, and Changes in 8.0.0. Version compatibility: dropped support for Python 3.7, as it reached EOL last June. Features and improvements: improved diffs that pytest prints when an assertion fails, including:
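
As a minimal illustration (not from the post), this is the kind of failing assertion where the improved diffs appear:

# test_diff.py -- run with `pytest`
def test_lists_match():
    # pytest 8 prints a more readable diff of the mismatch here
    assert [1, 2, 3, 4] == [1, 2, 5, 4]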
Categories: FLOSS Project Planets

Doug Hellmann: virtualenvwrapper 6.1.0 - hook scripts in project directories

Mon, 2024-01-29 03:48
What’s new in 6.1.0?

  • source project_dir/.virtualenvwrapper/predeactivate when deactivating
  • source project_dir/.virtualenvwrapper/postactivate during activation
Categories: FLOSS Project Planets

TestDriven.io: Working with Static and Media Files in Django

Sun, 2024-01-28 17:28
This article looks at how to work with static and media files in a Django project, locally and in production.
Categories: FLOSS Project Planets

TechBeamers Python: IndexError: List Index Out of Range in Python

Sun, 2024-01-28 13:39

An “IndexError: list index out of range” in Python typically occurs when you try to access a list index that does not exist, that is, one beyond the bounds of the list. In this guide, we’ll explore the causes of this error and discuss various […]
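
A minimal reproduction (assuming a three-item list):

items = ["a", "b", "c"]  # valid indexes are 0 through 2 (or -1 through -3)
print(items[3])          # IndexError: list index out of range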

The post IndexError: List Index Out of Range in Python appeared first on TechBeamers.

Categories: FLOSS Project Planets

Awesome Python Applications: Aim

Sat, 2024-01-27 07:55

Aim: Aim is a self-hostable machine learning experiment tracker designed to handle 10,000s of training runs.


Categories: FLOSS Project Planets

Awesome Python Applications: explainshell.com

Sat, 2024-01-27 07:55

explainshell.com: A web-based tool to match command-line arguments to their man pages and help text.


Categories: FLOSS Project Planets

Awesome Python Applications: liberapay.com

Sat, 2024-01-27 07:49

liberapay.com: A recurrent donations platform, formerly known as gittip and gratipay.


Categories: FLOSS Project Planets

Awesome Python Applications: Mathesar

Sat, 2024-01-27 07:43

Mathesar: Self-hostable web application which provides a spreadsheet-like interface to a PostgreSQL database, enabling users of all technical skill levels to design data models, enter data, and build reports.


Categories: FLOSS Project Planets
