mattx edited this page Sep 6, 2023 · 10 revisions

CPython

The third_party/python folder in the repo contains a copy of the CPython 3.6.14 source code, patched to improve performance and to take advantage of features offered by Cosmopolitan Libc. These include loading pure-Python packages from within the executable, backports from Python 3.7, tab completion in the REPL (read-eval-print loop) via bestline, size-optimization tricks, and more.

Screenshot from 2023-09-04 21-12-11

Build Instructions

git clone https://github.com/jart/cosmopolitan && cd cosmopolitan
make o//third_party/python/python.com   # this will build python.com only
make o//third_party/python              # this will run CPython stdlib tests as well
./o/third_party/python/python.com

python.com is built as a regular Makefile target within the repo. Its configuration and build requirements are specified in third_party/python/python.mk, so there is no need to run ./configure.

Enhancements

In addition to providing Python v3.6, Cosmopolitan's python.com adds the following features.

Single File Zip Loading

The files comprising the CPython 3.6.14 standard library are stored within the executable itself. You can list them with:

unzip -vl python.com
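
The same listing can be produced from Python. The sketch below assumes that the standard zipfile module can read the archive embedded in python.com the way unzip does, which this page does not confirm. The helper name list_embedded and the default .python/ prefix filter are illustrative choices, not part of Cosmopolitan.

```python
import zipfile


def list_embedded(archive_path, prefix=".python/"):
    """Return the sorted names of archive members stored under `prefix`."""
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(
            name for name in archive.namelist() if name.startswith(prefix)
        )
```

Calling list_embedded("python.com") should then return the embedded file names, and passing prefix="" returns every member of the archive.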

To add your own files to python.com, run:

mkdir ./.python
cp /your/package.py ./.python
zip -r ./python.com ./.python

Now you can import your own package from within python.com.
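
For build scripts, the zip step can also be sketched with Python's zipfile module. This is a minimal sketch, not the project's own tooling: add_package is a hypothetical helper, and whether zipfile can safely append to the python.com executable itself is an assumption, so try it on a copy first.

```python
import os
import zipfile


def add_package(archive_path, source_path, prefix=".python"):
    """Append one file to the archive under `prefix` and return its stored name."""
    arcname = prefix + "/" + os.path.basename(source_path)
    # Mode "a" adds members to an existing archive instead of replacing it.
    with zipfile.ZipFile(archive_path, "a", zipfile.ZIP_DEFLATED) as archive:
        archive.write(source_path, arcname)
    return arcname
```

The stored name mirrors what zip -r produces for ./.python/package.py, namely .python/package.py. Note that appending the same file twice leaves two entries with the same name in the archive.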

Default Arguments File (White-Labeling)

If you want python.com to run a specific script on startup instead of opening the REPL, create a file called .args and add it to python.com. This is useful when you want to rename python.com to your own executable and have it perform one specific task. See the example below:

Screenshot from 2023-09-04 21-46-02

The .args file should have one argument per line. These arguments are inserted before other CLI arguments. The ... indicates that arguments from the command line should be accepted.
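
The file format described above is simple enough to generate from a script. The following is a minimal sketch: build_args_text is a hypothetical helper, and it places the ... line last on the assumption that command-line arguments should follow the fixed ones.

```python
def build_args_text(arguments, accept_cli_args=True):
    """Return .args file contents with one argument per line."""
    lines = list(arguments)
    for argument in lines:
        if "\n" in argument:
            raise ValueError("an argument cannot span multiple lines")
    if accept_cli_args:
        # A final "..." line asks for command-line arguments to be accepted.
        lines.append("...")
    return "\n".join(lines) + "\n"
```

For example, build_args_text(["-m", "mytool"]) yields the three lines -m, mytool and ..., where mytool stands in for your own module. Write the result to a file named .args and add it to python.com with zip, as with the package files above.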

Code-Completion Bestline REPL

python.com has been modified to use bestline, enhancing the REPL experience across operating systems:

Screenshot from 2023-09-04 21-19-52

See how copy has been typed, with right suggested in the background to complete it? That's bestline in action. It helps a ton when typing in the REPL.

Cosmo Module

python.com ships a special module called cosmo, which exposes some low-level utilities from the Cosmopolitan Libc library. For example:

import cosmo

# print the number of CPU cycles elapsed
# since the program started
print(cosmo.rdtsc() - cosmo.kStartTsc)

def foo(a, b):
    return a + b

# trace C function calls in a block
# (needs python.com.dbg, output logged to stderr)
with cosmo.ftrace() as f:
    foo(2, 3)
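
Because cosmo exists only inside python.com, a script that should also run under a stock CPython interpreter needs to guard the import. The sketch below shows one way to do that. It uses only cosmo.rdtsc(), cosmo.kStartTsc and cosmo.kernel, all listed in the module help, while the helper names and the sys.platform fallback are illustrative assumptions.

```python
import sys

try:
    import cosmo  # only available inside Cosmopolitan's python.com
except ImportError:
    cosmo = None


def cycles_since_start():
    """Return CPU cycles elapsed since process start, or None without cosmo."""
    if cosmo is None:
        return None
    return cosmo.rdtsc() - cosmo.kStartTsc


def platform_name():
    """Return cosmo.kernel when available, else fall back to sys.platform."""
    if cosmo is None:
        return sys.platform
    return cosmo.kernel
```

Under a regular interpreter cycles_since_start() returns None, so callers can skip cycle-based measurements instead of crashing on import.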

To read the module's built-in documentation, run the following in the REPL:

>>> import cosmo
>>> cosmo?

This displays:

Help on built-in module cosmo:

NAME
    cosmo - Cosmopolitan Libc Module

DESCRIPTION
    This module exposes low-level utilities from the Cosmopolitan library.

    Static objects:

    MODE -- make build mode, e.g. "", "tiny", "opt", "rel", etc.
    IMAGE_BASE_VIRTUAL -- base address of actually portable executable image
    kernel -- o/s platform, e.g. "linux", "xnu", "metal", "nt", etc.
    kStartTsc -- the rdtsc() value at process creation.

FUNCTIONS
    crc32c(bytes, init=0)
        Computes 32-bit Castagnoli Cyclic Redundancy Check.

        Used by ISCSI, TensorFlow, etc.
        Similar to zlib.crc32().

    decimate(bytes)
        Shrinks byte buffer in half using John Costella's magic kernel.

        This downscales data 2x using an eight-tap convolution, e.g.

            >>> cosmo.decimate(b'\xff\xff\x00\x00\xff\xff\x00\x00\xff\xff\x00\x00')
            b'\xff\x00\xff\x00\xff\x00'

        This is very fast if SSSE3 is available (Intel 2004+ / AMD 2011+).

    exit1()
        Calls _Exit(1).

        This function is intended to abruptly end the process with less
        function trace output compared to os._exit(1).

    ftrace()
        Enables logging of C function calls to stderr, e.g.

            with cosmo.ftrace() as F:
                WeirdFunction()

        Please be warned this prints massive amount of text. In order for it
        to work, the concomitant .com.dbg binary needs to be present.

    getcpucore()
        Returns 0-indexed CPU core on which process is currently scheduled.

    getcpunode()
        Returns 0-indexed NUMA node on which process is currently scheduled.

    pledge(promises, execpromises)
        Permits syscall operations, e.g.

            >>> cosmo.pledge(None, None)  # assert support
            >>> cosmo.pledge('stdio rpath tty', None)

        This function implements the OpenBSD pledge() API for
        OpenBSD and Linux, where we use SECCOMP BPF. Read the
        Cosmopolitan Libc documentation to learn more.

    popcount(bytes)
        Returns population count of byte sequence, e.g.

            >>> cosmo.popcount(b'\xff\x00\xff')
            16

        The population count is the number of bits that are set to one.
        It does the same thing as `Long.bit_count()` but for data buffers.
        This goes 30gbps on Nehalem (Intel 2008+) so it's quite fast.

    rdtsc()
        Returns CPU timestamp counter.

    syscount()
        Returns number of SYSCALL instructions issued to kernel by C library.

        Context switching from userspace to kernelspace is expensive! So it is
        useful to be able to know how many times that's happening in your app.

        This value currently isn't meaningful on Windows NT, where it currently
        tracks the number of POSIX calls that were attempted, but have not been
        polyfilled yet.

    unveil(path, permissions)
        Permits filesystem operations, e.g.

            >>> cosmo.unveil('', None)     # assert support
            >>> cosmo.unveil('.', 'rwcx')  # permit current dir
            >>> cosmo.unveil(None, None)   # commit policy

        This function implements the OpenBSD unveil() API for
        OpenBSD and Linux where we use Landlock LSM. Read the
        Cosmopolitan Libc documentation to learn more.

    verynice()
        Makes current process as low-priority as possible.

DATA
    IMAGE_BASE_VIRTUAL = 4194304
    MODE = ''
    kStartTsc = 10992130919507855
    kernel = 'linux'
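
To make two of the entries in the help text concrete, here are slow pure-Python reference versions of popcount() and crc32c(). They are sketches for cross-checking results, not the Cosmopolitan implementations. The crc32c_reference function implements the standard CRC-32C (Castagnoli) algorithm, and the assumption that cosmo.crc32c chains its init argument the way zlib.crc32() does rests only on the "Similar to zlib.crc32()" note above.

```python
def popcount_reference(data):
    """Count the bits set to one across a bytes-like object."""
    return sum(bin(byte).count("1") for byte in data)


def crc32c_reference(data, init=0):
    """Bitwise CRC-32C (Castagnoli) using the reflected polynomial 0x82F63B78."""
    crc = init ^ 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial when that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

popcount_reference(b'\xff\x00\xff') returns 16, matching the example in the help text. Inside python.com, comparing these functions against cosmo.popcount() and cosmo.crc32c() on the same input is a quick sanity check.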

References

Reading through some of the issues and PRs related to the development of python.com in the Cosmopolitan Libc repo helps explain how python.com came to be: