book/version_control/notebooks.md
Lines changed: 14 additions & 1 deletion
@@ -1,4 +1,17 @@
1
-
# Jupyter Notebooks: JSON-format
1
+
## File types and git: text versus binary
2
+
3
+
Version control systems oriented towards software development and programming are typically focused on **text-based files**: files where the contents are viewable on your computer as human-readable text. **Binary files,** on the other hand, are organized and saved with bits (`0`'s and `1`'s) and are not human-readable. Although this is a simplified description of the way computers store information (you can read more [here](https://en.wikipedia.org/wiki/Binary_file)), it is enough for our purposes to recognize that text-based files are best suited for use with version control systems; in other words, your Python code!
4
+
5
+
* Examples of common text-based file extensions are: `txt`, `md`, `csv`, `ipynb`, `py`, `html`, etc.
6
+
* Examples of common binary files are: `pdf`, `ppt`, `xlsx`, `docx`, etc.
7
+
8
+
```{admonition} Try it!
9
+
Try exploring a few files on your computer to confirm whether they are text-based or binary by opening them in a text editor. You will easily be able to tell the difference because one is readable and the other is not.
10
+
11
+
Note that in Windows if you are using Notepad (the default), you will want to select "Word Wrap" under the "Format" menu to fit the contents of very long lines within the visible width of the window.
12
+
```
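
The same check can be approximated in code. The sketch below is a heuristic only, not how git itself classifies files: the function name `looks_like_text` and the sample size are assumptions made for illustration. It treats a file as text if the first few kilobytes contain no NUL bytes and decode as UTF-8.

```python
import codecs
import tempfile
from pathlib import Path


def looks_like_text(path, sample_size=8192):
    """Heuristic: text files have no NUL bytes and decode cleanly as UTF-8."""
    sample = Path(path).read_bytes()[:sample_size]
    if b"\x00" in sample:
        return False
    # An incremental decoder tolerates a multi-byte character that was
    # cut in half at the end of the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


# Demonstration with two throwaway files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    text_file = Path(tmp) / "notes.md"
    text_file.write_text("# Notes\nThis is human-readable text.\n", encoding="utf-8")

    binary_file = Path(tmp) / "image.bin"
    binary_file.write_bytes(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR")

    text_result = looks_like_text(text_file)
    binary_result = looks_like_text(binary_file)

print("notes.md looks like text:", text_result)
print("image.bin looks like text:", binary_result)
```

A heuristic like this can be fooled (for example by text saved in an encoding other than UTF-8), so opening the file in an editor remains the simplest test.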
13
+
14
+
## Jupyter Notebooks: JSON-format
2
15
3
16
Jupyter notebooks (`ipynb`) are a special case in the discussion of text versus binary. While the contents of your Markdown and code cells are saved as text in the file, the output of the code cells is sometimes in a binary format. For example, if you create a plot using matplotlib and save the notebook, that plot output will be binary. This unfortunately makes it a little more difficult to use notebooks with version control, but if we are aware of the issue, it is not a problem; we will show you how.
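
To make this concrete, here is a minimal sketch that builds a simplified notebook structure as a Python dictionary and shows how much of the saved file is taken up by a plot output. The notebook contents, the fake image bytes, and the helper `strip_outputs` are all assumptions made for illustration; a real notebook has more fields, and dedicated tools exist for clearing outputs.

```python
import base64
import copy
import json

# Stand-in for real image bytes; an actual plot would be much larger.
fake_png = base64.b64encode(b"\x89PNG fake image bytes " * 50).decode("ascii")

# Simplified notebook: one Markdown cell and one code cell with a plot output.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# My analysis"]},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["plt.plot(x, y)"],
            "outputs": [
                {
                    "output_type": "display_data",
                    "metadata": {},
                    "data": {"image/png": fake_png, "text/plain": ["<Figure>"]},
                }
            ],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}


def strip_outputs(nb):
    """Return a copy of the notebook with all code-cell outputs removed."""
    stripped_nb = copy.deepcopy(nb)
    for cell in stripped_nb["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped_nb


stripped = strip_outputs(notebook)
size_with = len(json.dumps(notebook))
size_without = len(json.dumps(stripped))

print("characters with outputs:   ", size_with)
print("characters without outputs:", size_without)
```

The Markdown and code cells are short, readable text, while the image output is one long encoded string that changes completely every time the plot is regenerated. That is why notebooks with saved outputs produce noisy changes under version control.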
book/version_control/version_control.md
Lines changed: 12 additions & 26 deletions
@@ -56,7 +56,7 @@ Note that in Windows if you are using Notepad (the default), you will want to se
56
56
57
57
## A different way of thinking?
58
58
59
-
As you will see in the other chapters on git, when applied to code, version control takes on a very different appearance than what you are used to with traditional backup software, for example, Microsoft Word auto-save, or cloud-based services like OneDrive, Dropbox or even [Visual Studio Code Share](../install/ide/vsc.md). All of these platforms are set up in a user-friendly way that is _focused on a single file._ This works fine when we are writing a report like a thesis. However, it does **not** work well when it comes to computer programs, because in addition to the files themselves, the _contents of the file_ become critical. As we will see, git is a version control software that allows us to compare and track changes in every character of text within a file, which is very useful when writing code, as well as working with a distributed team of collaborators.
59
+
As you will see in the other chapters on git, when applied to code, version control looks very different from what you are used to with traditional backup software, for example, Microsoft Word auto-save, or cloud-based services like OneDrive, Dropbox or even Visual Studio Code Share. All of these platforms are set up in a user-friendly way that is _focused on a single file._ This works fine when we are writing a report like a thesis. However, it does **not** work well when it comes to computer programs, because in addition to the files themselves, the _contents of the file_ become critical. As we will see, git is version control software that allows us to compare and track changes in every character of text within a file, which is very useful when writing code, as well as when working with a distributed team of collaborators.
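
As a rough illustration of what comparing the contents of a file looks like, the sketch below uses Python's standard `difflib` module to compare two versions of a small script. This is a stand-in and not git's own implementation; the file name `area.py` and both versions of the code are made up for the example.

```python
import difflib

# Two hypothetical versions of the same small file.
old_version = (
    "def area(radius):\n"
    "    return 3.14 * radius ** 2\n"
).splitlines(keepends=True)

new_version = (
    "import math\n"
    "\n"
    "def area(radius):\n"
    "    return math.pi * radius ** 2\n"
).splitlines(keepends=True)

# unified_diff yields lines prefixed with "-" (removed) or "+" (added).
diff = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="area.py (old)",
        tofile="area.py (new)",
    )
)

print("".join(diff))
```

The output marks each changed line individually, which is the kind of line-by-line record that a file-level backup tool does not provide.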
60
60
61
61
`````{admonition} Tip
62
62
:class: tip
@@ -65,38 +65,24 @@ To use version control (git) effectively, different versions of files are tracke
65
65
**Avoid copying and renaming files as much as possible!**
66
66
`````
67
67
68
-
## git and GitLab
68
+
## git and GitHub
69
69
70
70
**What is git?**
71
71
72
72
[Git](https://git-scm.com/) is a version control system (VCS), used by a wide variety of engineers and software developers to work on projects together in parallel. It provides multiple benefits such as tracking changes to files, working side by side with other people, and the ability to roll back to previous versions of files without losing track of newer changes. It is free and open-source software.
73
73
74
-
Note that while git is free and can be used on a variety of operating systems, there are many 3rd party softwares that _use_ git directly, or are heavily dependent on git. For example, GitLab and GitHub are two companies that provide cloud-based servers for hosting git repositories, as well as additional features like user groups, discussion channels, and even hosting of websites
75
-
76
-
**What is GitLab?**
77
-
78
-
GitLab is a cloud-based version control system built around git. It provides a lot more features such as Issues, Merge Requests, CI/CD pipelines, etc. TU Delft has a license to use GitLab on our own local webservers---this means that all of the files are stored digitally on the TU Delft campus. This is also why TU Delft has their our "own" GitLab located at `gitlab.tudelft.nl`, rather than the "normal" GitLab at `gitlab.com`, and is also something you will have access to throughout your studies.
79
-
80
-
**What is GitHub?**
81
-
82
-
GitHub is a competitor company to GitLab. It provides very similar services, but they are often called different names, or have slightly different features. Although we will not be using it directly, GitHub provides a free software that is very useful: **GitHub Desktop**! This software allows you to interact in a nice graphical interface with the version control of your files. An alternative is the [Git functionality in VS Code](../workflows/git/intro.md)
83
-
74
+
Note that while git is free and can be used on a variety of operating systems, there are many third-party tools that _use_ git directly, or are heavily dependent on git. For example, GitHub is a company that provides cloud-based servers for hosting git repositories, as well as additional features like user groups, discussion channels, and even hosting of websites. Furthermore, GitHub provides a free application that is very useful: **GitHub Desktop**! This software gives you a graphical interface for working with the version control of your files.
84
75
85
76
## Main concepts and terminology
86
77
87
78
Here we present a list of the terminology we may use when referring to version control systems (VCS). Do not panic if you do not understand what each of the following means. Later, we will provide a more elaborate explanation with examples. Bear in mind that the list below is not exhaustive, and more terms may show up. Also, if you only use the GUI option, you might not encounter some of them.
88
79
89
-
1. **Repository:** storage, where VCS (git, in our case) store their history of changes and information about who made them.
90
-
1. **Remote (of repository):** a version control repository stored somewhere else and the changes between the two are usually synchronized. We will refer to the Gitlab repository as a *remote*.
91
-
1. **Commit:** Snapshot of the current state of the project. If a commit contains changes to multiple files, all the changes are recorded together.
92
-
1. **Staging:** preparation of files to be committed. During the staging we propose files to be committed.
93
-
2. **Snapshot:** copy of the current version of the entire repository.
94
-
3. **Cloning:** copying (downloading) an existing project on your laptop. Usually, it is done only during the first time of getting the remote repository.
95
-
4. **Tracked (files):** files that Git knows about -- they are either in the staging area or were previously added to the repository.
96
-
5. **Untracked (files):** files that Git does not know about -- they are likely new files that have not been staged yet.
97
-
6. **Pushing:** uploading new commits (changes) to the remote server.
98
-
7. **Pulling:** retrieving new commits from the remote repository.
99
-
8. **Fetching:** check for new changes on the remote repository without pulling them yet.
100
-
9. **Conflict:** when changes made by multiple users to the same file are incompatible, you can get into a conflict. _Helping users resolve those conflicts is one of the key advantages of VCS._
101
-
10. **Branch:** development (time) line. The main development line is called `main`.
102
-
11. **Merge:** combining the commits of two branches, for example, changes on a development branch are merged into the `main` branch.
80
+
- **Repository:** storage where the VCS (git, in our case) keeps its history of changes and information about who made them.
81
+
- **Remote (of repository):** a copy of the repository stored somewhere else, usually kept synchronized with your local copy. We will refer to the GitHub repository as a *remote*.
82
+
- **Commit:** snapshot of the current state of the project. If a commit contains changes to multiple files, all the changes are recorded together.
83
+
- **Cloning:** copying (downloading) an existing project to your laptop. Usually, this is done only the first time you get the remote repository.
84
+
- **Pushing:** uploading new commits (changes) to the remote server.
85
+
- **Pulling:** retrieving new commits from the remote repository.
86
+
- **Conflict:** when changes made by multiple users to the same file are incompatible, you can get into a conflict. _Helping users resolve those conflicts is one of the key advantages of VCS._
87
+
- **Branch:** development (time) line. The main development line is called `main`.
88
+
- **Merge:** combining the commits of two branches; for example, changes on a development branch are merged into the `main` branch.