
Posts by Python4DataScience

A new era for Open Research Europe European Commission to co-fund new phase of operation for open access publishing platform.

In future, Open Research Europe (ORE) will no longer be open solely to projects funded under the EU Programmes for Research; instead, all research institutions in the participating countries will be able to use the platform free of charge: research-and-innovation.ec.europa.eu/news/all-res...

3 days ago 5 4 0 0
Open data A topic-based overview of public repositories containing research data. Agricultural sciences: AQUASTAT Dissemination System, Global information system of the Food and Agriculture Organization of t...

We have now significantly expanded our collection of open data and organised it by topic: www.python4data.science/en/latest/da...
#Python #DataScience #OpenData #Agriculture #Biology #Chemistry #Climate #Weather #ComputerNetwork #Energy #Finance #GeoScience #Healthcare #ImageProcessing #Medicine

6 days ago 3 1 0 0
Artificial Intelligence in the Review Process

The German Research Foundation (@dfg.de) has established a binding framework for the use of AI. The aim is to ensure the legally compliant and transparent use of AI: www.dfg.de/en/news/news...
#Research #RSE #Science #DataScience #ArtificialIntelligence #AI #Compliance

2 weeks ago 2 1 1 0

We are a community sponsor and would love to meet you at the conference.
#Python #DataScience

2 weeks ago 2 0 0 0
Configuration You can use configuration files to change the way pytest runs. If you repeatedly use certain options in your tests, such as --verbose or --strict-markers, you can store them in a configuration file s...

… and now we have also moved the pytest configuration to the pyproject.toml file: python-basics-tutorial.readthedocs.io/en/latest/te...
#Python #Testing #pytest

3 weeks ago 2 1 0 0
tox tox is an automation tool that works similarly to a CI tool, but can be run both locally and in conjunction with other CI tools on a server. In the following, we will set up tox for our Items appli...

Since tox 4.44.0, tox.ini has been frozen. Only the TOML configuration is supported for further development. We have therefore updated our tutorial accordingly: python-basics-tutorial.readthedocs.io/en/latest/te... #Python #testing #tox

1 month ago 2 1 1 0

We are pleased that we were able to develop an idea on how to better deal with LLM agents in the future.

1 month ago 2 0 0 0
Memray Memory usage is difficult to control in Python projects because the language does not explicitly indicate where memory is allocated, module imports can significantly increase consumption, and it is...

We analyse the memory consumption of our applications with memray and also monitor it continuously with pytest-memray: python4data.science/en/latest/pe...
#Python

1 month ago 1 0 0 0
Git Internals So far, we have looked at how you can use Git to manage the different states of your code. Now we want to show you the data and storage models that underlie Git. Data Model: You will be able to use...

Git 2.53 provides faster insights into the repository structure with 'git repo structure'. However, to better understand this, it is helpful to be more familiar with the Git data and storage models: www.python4data.science/en/latest/pr...
#Git

1 month ago 2 1 0 0
Performance Python can be used to write and test code quickly because it is an interpreted language that types dynamically. However, these are also the reasons it is slow when performing simple tasks repeatedl...

The section on performance measurements and finding bottlenecks has been significantly expanded to include cProfile/profiling.tracing, tprof, and profiling.sampling/Tachyon: www.python4data.science/en/latest/pe...
#Python #Performance

2 months ago 4 2 0 0
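As an illustration of the kind of bottleneck hunting the post above refers to, here is a minimal sketch using only the long-standing standard-library `cProfile` and `pstats` modules. It is not taken from the tutorial and does not cover tprof or Tachyon; the functions `slow_sum` and `run` are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: sum of squares in a plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run():
    """Hypothetical workload that calls the hot spot several times."""
    return [slow_sum(20_000) for _ in range(5)]


# Collect profiling data only around the code of interest.
profiler = cProfile.Profile()
profiler.enable()
results = run()
profiler.disable()

# Render the five entries with the highest cumulative time into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the call chain that leads to the expensive function, which is usually the first thing worth inspecting.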
Document In order for your software package to be useful, documentation is required that describes how your software can be installed, operated, used and improved: Those who want to use your package need in...

We have updated the documentation section with references to README, CONTRIBUTING, CHANGELOG, etc.
python-basics-tutorial.readthedocs.io/en/latest/do...
#Python #Documentation

2 months ago 4 1 0 0
Extensions Administration: SQLAlchemy Admin for Starlette/FastAPI: Flexible admin interface for SQLAlchemy models. Piccolo Admin: Simple but powerful admin int...

We have updated the FastAPI extensions. We find it very surprising that extensions which have not been updated for over a year are still being downloaded millions of times.
www.python4data.science/en/latest/da...
@fastapi.tiangolo.com
#Python #FastAPI #REST

2 months ago 3 1 0 0
Unicode and character encodings Special characters and escape sequences: \n stands for the newline character and \t for the tab character. Character sequences that begin with a backslash and are used to represent other characters a...

I took a look at the changes coming with Python 3.15 – and I can't wait to put them to productive use. I've already updated our tutorials:
• Performance measurements: www.python4data.science/en/latest/pe...
• Tachyon: www.python4data.science/en/latest/pe...
#Python

3 months ago 5 2 1 0
pytest pytest is an alternative to Python’s unittest module that simplifies testing even further. pytest automatically recognises tests based on filenames and functions that start with test_, while unitte...

We have updated the section on pytest with many exciting use cases
* on command line options
* on generating markers
* and on parameterising exceptions
python-basics-tutorial.readthedocs.io/en/latest/te...
#Python #Testing #pytest

4 months ago 4 2 0 0
Precision-Recall-Curve comparison between workspace and HEAD

Receiver operating characteristic (ROC) comparison between workspace and HEAD

Confusion Matrix comparison between workspace and HEAD

We have updated our tutorial on data management with DVC. It also allows you to create lightweight data science and data modelling workflows and execute them in a parameterised manner: www.python4data.science/en/latest/pr...
#Data #Versioncontrol #Git #DataScience #Modeling #Python

5 months ago 6 3 0 0
Configuring Claude Code or Cursor for uv How do we configure Claude Code or Cursor to automatically use uv instead of pip for Python package management? Claude Code Claude Code uses CLAUDE.md files to configure your project’s storage and ...

Now we have also described how to use uv reliably for Cursor: www.python4data.science/en/latest/pr...
#CursorAI #Python #Packaging #uv

6 months ago 2 1 0 0
Configuring Claude Code for uv How do we configure Claude Code to automatically use uv instead of pip for Python package management? Claude Code uses CLAUDE.md files to configure your project’s storage and context, ensuring a co...

We have now described how to create a configuration for Claude Code so that it uses uv reliably: python4data.science/en/latest/pr...
#ClaudeCode #Python #Packaging #uv

6 months ago 3 0 1 0
pandas pandas is a Python library for data analysis that has become very popular in recent years. On the website, pandas is described thus: “pandas is a fast, powerful, flexible and easy to use open sourc...

Because we have often been asked recently whether pandas is slow and whether we should use Polars, Dask or DuckDB instead, we have put together an initial overview of these technologies: www.python4data.science/en/latest/wo...
#Python #Performance #DuckDB

6 months ago 6 2 0 0
Files and directories pathlib implements path operations using pathlib.PurePath and pathlib.Path objects. The os and os.path modules, on the other hand, offer functions that work at a low level with str- and bytes which...

We have now completely switched to pathlib: python-basics-tutorial.readthedocs.io/en/latest/sa...
#Python

6 months ago 4 2 0 0
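The contrast drawn in the preview above, between `pathlib` objects and the string-based `os.path` functions, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not an excerpt from the tutorial; the directory layout and the file name `items.csv` are hypothetical.

```python
import os.path
import tempfile
from pathlib import Path, PurePosixPath

# PurePath objects only manipulate path components; they never touch the file system.
pure = PurePosixPath("data") / "raw" / "items.csv"
print(pure.name, pure.suffix, pure.parent)

# The os.path functions do similar work, but on plain str values.
joined = os.path.join("data", "raw", "items.csv")
print(os.path.basename(joined))

# Path objects add real I/O on top of the pure path operations.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "raw" / "items.csv"  # hypothetical file
    target.parent.mkdir(parents=True)
    target.write_text("id,name\n1,example\n", encoding="utf-8")
    found = sorted(p.name for p in Path(tmp).rglob("*.csv"))
    content = target.read_text(encoding="utf-8")

print(found, content)
```

Keeping everything as `Path` objects avoids mixing string concatenation with path logic, which is one plausible reason for switching a code base over completely.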
Ruff Ruff is an extremely fast Python linter and code formatter written in Rust that can enforce the rules of flake8, isort, perflint, Black, Bandit, and others. In total, Ruff can check over 800 rules....

We have finally documented Ruff – the tool greatly simplifies static code analysis for Python projects: www.python4data.science/en/latest/pr...
#Python #Ruff

7 months ago 3 1 0 0
Creating a distribution package Distribution Packages are archives that can be uploaded to a package index such as pypi.org and installed with pip. Structure: A minimal distribution package can look like this, for example: pyproj...

We have now updated our packaging tutorial to include PEP 639, which enables SPDX-compliant licensing: python-basics-tutorial.readthedocs.io/en/latest/pa...
#Python #Packaging #SPDX #Licensing

7 months ago 2 1 0 0
JSON Overview: Data structure support (+-): JSON supports array and map or object structures and many different data types including strings, numbers, boolean, null etc., but no date formats. However, ...

We have added a section on additional JSON tools: www.python4data.science/en/latest/da...
#Python #JSON

7 months ago 2 1 0 0
Geodata File formats: PMTiles: PMTiles is a general format for tile data addressed by Z/X/Y coordinates. This can be cartographic vector tiles, remote sensing data, JPEG images or similar. HTTP Range Reque...

We have added several geopython libraries: www.python4data.science/en/latest/da...
#Python #Geospatial #GeoPython

7 months ago 2 1 0 0
Licensing In order for others to use your software, it should have one or more licences that describe the terms of use. Otherwise, it is likely to be protected by copyright. Authors are those who have origin...

We have significantly expanded the section on licences for AI systems: www.python4data.science/en/latest/pr...
#AI #Licensing #OpenData #OpenSource

7 months ago 4 1 0 0

💥Spack v1.0 is out!💥

This is a huge milestone. We reworked the core to add compiler dependencies, and we're introducing a stable package API.

🚀1.0 also adds concurrent builds, better includes, and much more -- read it all in the release notes!

github.com/spack/spack/...

8 months ago 41 16 0 5
XKCD #3117: Replication Crisis

The XKCD comic on reproducible scientific results fits perfectly with our tutorial 🧐 😉
www.python4data.science/en/latest/pr...

8 months ago 8 0 0 0
Graph from GitHub’s Octoverse 2024 report showing a spike in utilization of Jupyter Notebooks across GitHub. This is calculated by looking at the distinct number of public repositories with at least one Jupyter Notebook by the year the repository was created. Since 2016, we have seen this number surge from near zero to more than 1.5 million repositories using Jupyter Notebooks.

Almost more significant than the success of #Python is the growth of #Jupyter #Notebooks: “Data scientists and machine learning researchers commonly use the #OpenSource application for #MachineLearning, #DataViz, and more.”
jupyter-tutorial.readthedocs.io/en/latest/in...

8 months ago 22 5 2 0
Protomaps Protomaps is an open source project for the creation and use of vector maps. It was developed as a lightweight alternative to conventional map providers and offers a number of advantages. Open Sour...

We have added a section on protomaps to our PyViz tutorial. Protomaps makes map visualisations so much easier.
pyviz-tutorial.readthedocs.io/en/latest/pr...
#Protomaps #Geography #World #Map @protomaps.com

10 months ago 7 1 0 1
Geodata File formats: PMTiles: PMTiles is a general format for tile data addressed by Z/X/Y coordinates. This can be cartographic vector tiles, remote sensing data, JPEG images or similar. HTTP Range Reque...

We have expanded the section on geodata to include the most common (tile) file formats: www.python4data.science/en/latest/da...
#Geography #GIS

10 months ago 8 1 0 0
Licensing In order for others to use your software, it should have one or more licences that describe the terms of use. Otherwise, it is likely to be protected by copyright. Authors are those who have origin...

And a new section on AI/ML licences has also been added: www.python4data.science/en/latest/pr...
#AI #ML #License

11 months ago 1 1 0 1