Bytes and string handling in Python 3
Python 3 handles strings and bytes very differently from Python 2 (you can read more about this in this article or in this Pragmatic Unicode talk). Basically, strings (str) in Python 3 are Unicode by default, and "bytes" (bytes) are immutable sequences of integers from 0 to 255 (each one an 8-bit value). There is no implicit conversion between str and bytes in Python 3, so any conversion needs to be done explicitly using the encode (str → bytes) and decode (bytes → str) methods.
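As a minimal sketch of the explicit conversions described above (the sample text and the helper name `can_concatenate` are purely illustrative and not part of the Oppia codebase):

```python
text = "café"                     # str: a sequence of Unicode code points
data = text.encode("utf-8")       # str -> bytes, always explicit

# Indexing or iterating over bytes yields integers from 0 to 255.
byte_values = list(data)

# bytes -> str, again explicit.
round_tripped = data.decode("utf-8")


def can_concatenate(left, right):
    """Illustrative helper: reports whether `left + right` is allowed."""
    try:
        left + right
    except TypeError:
        # Python 3 refuses to mix str and bytes implicitly.
        return False
    return True
```

Here `can_concatenate(text, data)` returns `False`, because Python 3 raises a `TypeError` rather than guessing an encoding.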
Throughout Oppia, we typically use strings. However, you may come across bytes in places where there is an interaction with some outside library or API — for example, when standard input or output is read or written, or when data is read from or written to files. Some standard Python libraries also only accept bytes.
The general rule you should follow is to keep all text in Oppia as strings, where possible. If a conversion to bytes is necessary, that conversion should happen as close to the “edges” of the app as possible. So, for example:
- When you receive bytes from some library, immediately convert them to a string using decode.
- If you need to use a function that needs bytes, use encode to convert the string to bytes immediately before you call the function.
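The two rules above can be sketched with the standard `base64` module, which accepts and returns bytes. The wrapper name `to_base64` is hypothetical and only illustrates keeping the conversions at the edge, so that callers only ever see str:

```python
import base64


def to_base64(text: str) -> str:
    """Hypothetical wrapper: str in, str out, bytes only at the edge."""
    # Encode immediately before calling the function that needs bytes.
    raw = text.encode("utf-8")
    encoded = base64.b64encode(raw)  # bytes in, bytes out
    # Decode immediately after receiving bytes from the library.
    return encoded.decode("utf-8")
```

With this shape, the rest of the code can call `to_base64("hi")` and work with the resulting str without ever handling bytes directly.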
In the Oppia codebase, all data whose encoding we control should be encoded and decoded using utf-8 (for example, encode('utf-8')). If you find a case where utf-8 cannot be used, please raise this with the Core Maintainers team.
If, in some case, an external source returns or receives data with a different encoding, it is fine to use that encoding only for that source. However, please first be sure to investigate whether that source can be configured to use utf-8 instead.
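One possible way to confine a non-utf-8 encoding to a single source is sketched below. The source, the choice of `latin-1`, and the names `LEGACY_ENCODING`, `read_from_legacy_source`, and `write_to_legacy_source` are all assumptions made for illustration:

```python
# Assumption: a hypothetical external source that only speaks latin-1.
LEGACY_ENCODING = "latin-1"


def read_from_legacy_source(raw: bytes) -> str:
    """Decode bytes from the hypothetical source into a regular str."""
    return raw.decode(LEGACY_ENCODING)


def write_to_legacy_source(text: str) -> bytes:
    """Encode a str for the hypothetical source, and only for it."""
    return text.encode(LEGACY_ENCODING)


# Inside the app the value is a plain str, so everything else
# can keep using utf-8 as usual.
name = read_from_legacy_source(b"caf\xe9")
stored = name.encode("utf-8")
```

Keeping the special encoding in one named constant and two small functions makes it easy to switch the source to utf-8 later if it turns out to be configurable.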