
Commit 23200f7

v1.0.2
Merge 1.0.2 into master
2 parents ed3deac + 3060c89 commit 23200f7

6 files changed: +177 -364 lines changed

README.md

Lines changed: 13 additions & 11 deletions
@@ -7,36 +7,38 @@
77
### TL;DR: Faster List<Boolean> than ArrayList with the same exact behaviour.
88
- [Documentation](https://abductcows.github.io/java-bit-array/gr/geompokon/bitarray/BitArray.html)
99
- [Release](https://github.com/Abductcows/java-bit-array/releases/latest)
10-
10+
- [Benchmarks](https://github.com/Abductcows/bit-array-benchmarks)
1111
### Motivation
1212
This class is a replacement for the `ArrayList<Boolean>` type when working with its `List` interface. It boasts higher performance in add, remove and set operations and requires less memory for storing the same elements.
1313

1414
The BitArray is by all means an array; it is random access and all elements have contiguous indices at all times. Its behaviour is meant to be indistinguishable from ArrayList in order for programmers to be able to substitute it for the latter and benefit from its performance advantage.
1515

1616
### Few details
17-
Internally the array stores the boolean elements in an array of long primitives. These long primitives essentially form a sequence of bits; their chained bit representation in 2's complement. Boolean elements are converted to and from their bit equivalent to perform insertions, deletions etc. With the appropriate bitwise operations new elements can be added at a specific index and elements already in the array can be shifted to preserve the previous order. Thanks to that hack, element shifting and array resizing is much cheaper, all while the elements themselves occupy less space in memory.
17+
Internally the array stores the boolean elements in an array of long primitives. These long primitives essentially form a sequence of bits; their chained bit representation in 2's complement. Boolean elements are converted to and from their bit equivalent to perform insertions, deletions etc. With the appropriate bitwise operations new elements can be added at a specific index and elements already in the array can be shifted to preserve the previous order. Thanks to this "hack", element shifting and array resizing is much cheaper, all while the elements themselves occupy less space in memory.
1818

19-
### Performance
20-
With regard to the difference in performance, I have included a [temporary benchmark](https://github.com/Abductcows/java-bit-array/blob/dev/src/test/java/gr/geompokon/bitarray/BitArrayVsArrayListBenchmarkTest.java) file for you to test. A conservative report on my findings based on that is that this array is about 4-5 times faster in random index `add` and `remove` operations. `get` and `set` run at about the same time. I am looking into creating a more trustworthy benchmark using a benchmark framework like [JMH](https://github.com/openjdk/jmh) in order to be able to publish some results with confidence. If you have experience doing that and want to contribute, feel free to start an [issue](https://github.com/Abductcows/java-bit-array/issues).
19+
### Thread safety
20+
The class is not currently thread-safe; I will look into it properly at some point. For the time being you can use `Collections.synchronizedList()`.
2121

22-
### Disclaimer
23-
As you can tell from the project version and creation date, this class is very new and not battle-tested. As such I would discourage you from using it in a serious application just yet.
22+
### Null values
23+
Unfortunately, due to the implementation I have not been able to accommodate null values in the array. Null insertions or updates will throw NullPointerException.
24+
25+
### Performance
26+
For the performance difference, check out the [benchmark repository](https://github.com/Abductcows/bit-array-benchmarks). It includes results from my runs and the benchmark files should you want to run them yourself. The TL;DR is that the speedup in add/remove grows with the number of elements. The performance difference stems wholly from resizes and moves. For example, inserting 1000 elements at random indices with an initial capacity of 10 runs at 2x the speed. In the same scenario with 1.5M elements, the BitArray runs 13x faster. But for already resized arrays and insertions at the tail, the difference is minuscule. The numbers mentioned are quite conservative for safety. Also, it can easily handle `Integer.MAX_VALUE` elements, but cannot hold more.
2427

2528
# Getting Started
26-
You will need the class and source files. You can grab the [latest release](https://github.com/Abductcows/java-bit-array/releases/latest) (built with jdk-11) or download the project and run `mvn package/install` yourself. Releases contain a zip file with separate jars for classes, sources and javadoc. Include at least the class jar in your project and you will be able to use the BitArray. Looks like you are good to go.
29+
You will need the class and source files. You can grab the [latest release](https://github.com/Abductcows/java-bit-array/releases/latest) (built with jdk-11) or download the project and run `mvn install` yourself. Releases contain a zip file with separate jars for classes, sources and javadoc. Include at least the class jar in your project and you will be able to use the BitArray. Looks like you are good to go.
2730

2831
### Versioning
2932
The project uses [SemVer](https://semver.org/) for versioning.
3033

3134
### Contributions and future of this project
3235
I would like to work on this project with anyone willing to contribute. My estimation of the rough priority of actions needed is:
3336

34-
- Testing/debugging: Write better and well documented tests to enhance confidence
35-
- Benchmarking: Give ArrayList a run for their money
37+
- Testing: Improve tests to enhance confidence
3638
- Optimizing: 'cause why not. Maybe override a few of the AbstractList's default implementations
37-
- New features: Not sure what to add, suggestions very welcome
39+
- New features: Not sure if there is anything to add, suggestions very welcome
3840

39-
If you want to contribute, check out [CONTRIBUTING.md](https://github.com/Abductcows/java-bit-array/blob/master/CONTRIBUTING.md) for more info.
41+
I would also appreciate you sharing your opinion on this class and the project as a whole. If you want to contribute, check out [CONTRIBUTING.md](https://github.com/Abductcows/java-bit-array/blob/master/CONTRIBUTING.md) for more info.
4042

4143
### License
4244
This Project is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
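
The "Few details" paragraph in the README diff above describes storing booleans as bits inside an array of `long` primitives. A minimal sketch of that idea follows. It is not the project's implementation: the class `PackedBitsSketch` and its members are hypothetical names chosen for illustration, and the only detail borrowed from the commit is that index 0 of each word is treated as its most significant bit.

```java
// Hypothetical illustration only; not the BitArray class from this commit.
final class PackedBitsSketch {
    private static final int BITS_PER_LONG = 64;
    private final long[] data;

    PackedBitsSketch(int capacityInBits) {
        // round up so that a partially used last word is still allocated
        data = new long[(capacityInBits + BITS_PER_LONG - 1) / BITS_PER_LONG];
    }

    // Assumption: index 0 of a word is its most significant bit.
    private static long singleBitMask(int indexInLong) {
        return Long.MIN_VALUE >>> indexInLong;
    }

    boolean get(int index) {
        long word = data[index / BITS_PER_LONG];
        return (word & singleBitMask(index % BITS_PER_LONG)) != 0;
    }

    void set(int index, boolean bit) {
        int longIndex = index / BITS_PER_LONG;
        long mask = singleBitMask(index % BITS_PER_LONG);
        if (bit) {
            data[longIndex] |= mask;   // turn the bit on
        } else {
            data[longIndex] &= ~mask;  // turn the bit off
        }
    }

    int wordCount() {
        return data.length;
    }
}
```

With this layout 130 booleans fit in three `long` words, which is where the memory saving mentioned in the README comes from.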

pom.xml

Lines changed: 9 additions & 1 deletion
@@ -7,7 +7,7 @@
77

88
<groupId>gr.geompokon</groupId>
99
<artifactId>bit-array</artifactId>
10-
<version>1.0.0</version>
10+
<version>1.0.2</version>
1111

1212
<name>BitArray</name>
1313
<description>Java List of Booleans class with reduced memory usage and higher performance</description>
@@ -25,6 +25,14 @@
2525
<groupId>org.junit.jupiter</groupId>
2626
<artifactId>junit-jupiter-engine</artifactId>
2727
<version>5.1.0</version>
28+
<scope>test</scope>
29+
<optional>true</optional>
30+
</dependency>
31+
<dependency>
32+
<groupId>org.junit.jupiter</groupId>
33+
<artifactId>junit-jupiter-params</artifactId>
34+
<version>5.1.0</version>
35+
<scope>test</scope>
2836
<optional>true</optional>
2937
</dependency>
3038
</dependencies>

src/main/java/gr/geompokon/bitarray/BitArray.java

Lines changed: 76 additions & 59 deletions
@@ -50,7 +50,7 @@
5050
* not follow the one bit per entry principle.
5151
* </p>
5252
*
53-
* @version 1.0.0
53+
* @version 1.0.2
5454
* @see java.util.List
5555
* @see java.util.AbstractList
5656
* @see java.util.ArrayList
@@ -198,7 +198,6 @@ public Boolean get(int index) {
198198
public Boolean set(int index, Boolean bit) {
199199
Objects.requireNonNull(bit);
200200
ensureIndexInRange(index, elements - 1);
201-
modCount++;
202201
// get bit indices
203202
int longIndex = getLongIndex(index);
204203
int indexInLong = getIndexInLong(index);
@@ -319,43 +318,41 @@ private void addAndShiftAllRight(boolean bit, int longIndex, int indexInLong) {
319318
int maxLongIndex = getLongIndex(elements);
320319
// add the bit and save the LSB that was shifted out
321320
int bitIntValue = Boolean.compare(bit, Boolean.FALSE);
322-
int rightmostBit = insertInLong(bitIntValue, longIndex++, indexInLong);
321+
long rightmostBit = insertInLong(bitIntValue, 1, longIndex++, indexInLong);
323322
// keep inserting old LSB at 0 of next long and moving on with the new LSB
324323
while (longIndex <= maxLongIndex) {
325-
rightmostBit = insertInLong(rightmostBit, longIndex++, 0);
324+
rightmostBit = insertInLong(rightmostBit, 1, longIndex++, 0);
326325
}
327326
}
328327

329328
/**
330-
* Inserts the bit in the index of the long specified by the arguments and returns the previous LSB.
331-
*
332-
* <p>
333-
* Inserting at any index is done by splitting the long word in two parts and rejoining them after shifting and
334-
* setting the new bit. The LSB that is shifted out is returned.
335-
* </p>
329+
* Inserts the {@code lastLength} rightmost bits of lastValue in the position specified by {@code longIndex} and
330+
* {@code indexInLong}, and then shifts every element with index >= {@code indexInLong} to the right. The bits that
331+
* are shifted out are returned in the leftmost position
336332
*
337-
* @param bit the bit to be inserted
333+
* @param lastValue bits to be inserted into the long
334+
* @param lastLength length in bits of the last value
338335
* @param longIndex index of the long in the {@code data} array
339-
* @param indexInLong index of the bit in the long
340-
* @return LSB of the long before insertion
336+
* @param indexInLong index of the insertion bit in the long
337+
* @return bits that were shifted out due to the insertion
341338
*/
342-
private int insertInLong(int bit, int longIndex, int indexInLong) {
343-
// get right side [indexInLong : ], can not be empty, will be shifted
339+
private long insertInLong(long lastValue, int lastLength, int longIndex, int indexInLong) {
340+
// select the bits [indexInLong, (word end)] for the insertion
344341
long rightSide = (data[longIndex] << indexInLong) >>> indexInLong;
345-
// get left side [0 : indexInLong), can be empty, will remain intact
346-
long leftSide = data[longIndex] - rightSide;
347-
348-
// save LSB
349-
long rightSideLSB = rightSide & 1L;
350-
// unsigned shift to the right to make space for the new bit
351-
rightSide >>>= 1;
352-
// set the new bit
353-
rightSide |= (long) bit << (BITS_PER_LONG - 1 - indexInLong);
342+
// separate the left part, this will remain intact
343+
long leftSide = data[longIndex] & ~rightSide;
344+
345+
// save the bits that will be shifted out
346+
long rightSideShiftOut = selectBits(rightSide, BITS_PER_LONG - lastLength, lastLength);
347+
// unsigned shift to the right to make space for the new bits
348+
rightSide >>>= lastLength;
349+
// set the new bits
350+
rightSide |= lastValue << (BITS_PER_LONG - lastLength - indexInLong);
354351
// re-join the two parts
355-
data[longIndex] = leftSide + rightSide;
352+
data[longIndex] = leftSide ^ rightSide;
356353

357-
// return the LSB
358-
return (int) rightSideLSB;
354+
// return the discarded bits
355+
return rightSideShiftOut;
359356
}
360357

361358
/**
@@ -367,47 +364,63 @@ private int insertInLong(int bit, int longIndex, int indexInLong) {
367364
private void removeAndShiftAllLeft(int longIndex, int indexInLong) {
368365
// start at the end and work back to current long index
369366
int currentLongIndex = getLongIndex(elements - 1);
370-
int leftmostBit = 0; // dud value for first shift
367+
long leftmostBit = 0; // dud value for first shift
371368
// keep adding the old MSB as LSB of the previous long index and shifting the rest to the left
372369
while (currentLongIndex > longIndex) {
373-
leftmostBit = appendBitAndRemoveAtIndex(leftmostBit, currentLongIndex--, 0);
370+
leftmostBit = removeAtIndexAndAppend(leftmostBit, 1, currentLongIndex--, 0);
374371
}
375-
// add the final MSB as LSB of {@code longIndex} and shift only the bits to the removed's right
376-
appendBitAndRemoveAtIndex(leftmostBit, longIndex, indexInLong);
372+
// add the final MSB as LSB of longIndex and shift only the bits to the popped bit's right
373+
removeAtIndexAndAppend(leftmostBit, 1, longIndex, indexInLong);
377374
}
378375

379376
/**
380-
* Appends the bit at the end of the long specified by the arguments and removes the bit at {@code indexInLong}.
381-
*
382-
* <p>
383-
* Since {@code indexInLong} can be at the middle of the long word, removing the bit is done by splitting the
384-
* long in two parts, clearing the desired bit and shifting once to restore the order of the previous bits.
385-
* </p>
377+
* Removes the {@code lastLength} bits from the long specified by {@code longIndex} starting from {@code indexInLong}
378+
* and then appends the same length of bits from {@code lastValue} at the end of the long. The
386379
*
387-
* @param bit the bit to be appended to the long
380+
* @param lastValue bits to be appended to the long
381+
* @param lastLength length in bits of the last value
388382
* @param longIndex index of the long in the {@code data} array
389-
* @param indexInLong index of the bit in the long
390-
* @return bit at {@code longIndex} that was popped out
383+
* @param indexInLong index of the first removed bit in the long
384+
* @return bits that were popped from the long
391385
*/
392-
private int appendBitAndRemoveAtIndex(int bit, int longIndex, int indexInLong) {
386+
private long removeAtIndexAndAppend(long lastValue, int lastLength, int longIndex, int indexInLong) {
393387
// get right side [indexInLong : ], can not be empty, will be shifted
394388
long rightSide = (data[longIndex] << indexInLong) >>> indexInLong;
395389
// get left side [0 : indexInLong), can be empty, will remain intact
396-
long leftSide = data[longIndex] - rightSide;
390+
long leftSide = data[longIndex] & ~rightSide;
391+
392+
// save removed values
393+
long poppedValues = selectBits(rightSide, indexInLong, lastLength) >>> (BITS_PER_LONG - indexInLong - lastLength);
397394

398-
// save MSB
399-
int rightSideMSB = getBitInLong(rightSide, indexInLong);
400-
// clear MSB and shift to the left to make it "disappear"
401-
rightSide &= ~singleBitMask(indexInLong);
402-
rightSide <<= 1;
403-
// append the previous bit
404-
rightSide += bit;
395+
// clear copied bits and shift to the left
396+
rightSide = (rightSide << indexInLong + lastLength) >>> indexInLong;
397+
// append the previous bits
398+
rightSide |= lastValue;
405399

406400
// re-join the two parts
407-
data[longIndex] = leftSide + rightSide;
401+
data[longIndex] = leftSide ^ rightSide;
408402

409-
// return the MSB
410-
return rightSideMSB;
403+
// return the popped bits
404+
return poppedValues;
405+
}
406+
407+
/**
408+
* Returns {@code aLong} with every bit outside the range [start, start + length) cleared
409+
*
410+
* @param start start index of the selection
411+
* @param length number of set bits in the result
412+
* @return the bits of {@code aLong} inside the specified range, left in their original positions
413+
* @implSpec <p>
414+
* {@code start} should be in the range [0, 63]<br>
415+
* {@code length} should be in the range [1, 64]<br>
416+
* {@code start} and {@code length} should satisfy: start + length <= {@link #BITS_PER_LONG}
417+
* </p>
418+
*/
419+
private long selectBits(long aLong, int start, int length) {
420+
long mask = Long.MIN_VALUE >>> start; // need at least the first bit
421+
mask |= (Long.MIN_VALUE >>> start) - 1; // make everything to the right ones
422+
mask &= -(Long.MIN_VALUE >>> (start + length - 1)); // make everything from end of length and forward 0
423+
return aLong & mask;
411424
}
412425

413426
/**
@@ -430,6 +443,11 @@ private void ensureIndexInRange(int index, int endInclusive) {
430443
* </p>
431444
*/
432445
private void ensureCapacity() {
446+
// check for completely full array
447+
if (elements == Integer.MAX_VALUE) {
448+
throw new IllegalStateException("Cannot insert; array is completely full. Size = " + size());
449+
}
450+
// extend if currently full
433451
if (elements == data.length * BITS_PER_LONG) {
434452
doubleSize();
435453
}
@@ -439,7 +457,10 @@ private void ensureCapacity() {
439457
* Doubles the size of the array.
440458
*/
441459
private void doubleSize() {
442-
resize(2 * elements);
460+
// make sure new element count does not overflow
461+
// we can't index more than Integer.MAX_VALUE elements through the List interface anyway
462+
int newSize = (int) Math.min(2L * elements, Integer.MAX_VALUE);
463+
resize(newSize);
443464
}
444465

445466
/**
@@ -496,7 +517,6 @@ private int longsRequiredForNBits(int nBits) {
496517
(double) nBits / BITS_PER_LONG);
497518
}
498519

499-
500520
/*
501521
BitArray specific methods
502522
*/
@@ -557,17 +577,14 @@ public String toString() {
557577
*/
558578
public static BitArray fromString(String stringArray) {
559579

560-
String start = "Size = ";
580+
final String start = "Size = ";
561581

562582
if (!stringArray.startsWith(start)) {
563583
throw new UnknownFormatConversionException("Not a valid BitArray string");
564584
}
565585

566586
// count number of digits of the array size
567-
int currentIndex = start.length();
568-
while (stringArray.charAt(currentIndex) != ',') {
569-
currentIndex++;
570-
}
587+
int currentIndex = stringArray.indexOf(",", start.length());
571588

572589
int arraySize;
573590
try {
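
The `insertInLong` and `selectBits` changes in the hunks above can be tried in isolation with the sketch below. It is a standalone approximation, not the committed code. The class `BitInsertSketch` and the method `insertBit` are hypothetical, the `data` array is passed as a parameter instead of being a field, and `selectBits` uses a simpler two-shift mask than the three-step construction in the commit. It assumes `indexInLong + lastLength <= 64` and `lastLength < 64`, since Java masks `long` shift distances to six bits.

```java
// Hypothetical standalone sketch of the insert-with-carry technique from the diff.
final class BitInsertSketch {
    static final int BITS_PER_LONG = 64;

    // Keeps only bits [start, start + length) of aLong, counting index 0 as the MSB.
    static long selectBits(long aLong, int start, int length) {
        long mask = -1L >>> start;                       // ones from start to the end of the word
        mask &= -1L << (BITS_PER_LONG - start - length); // zeros after the end of the range
        return aLong & mask;
    }

    // Inserts the lastLength low bits of lastValue at indexInLong and returns the bits pushed out.
    static long insertInLong(long[] data, long lastValue, int lastLength, int longIndex, int indexInLong) {
        long word = data[longIndex];
        long rightSide = (word << indexInLong) >>> indexInLong; // bits [indexInLong, end of word]
        long leftSide = word & ~rightSide;                      // bits before indexInLong, untouched
        long shiftedOut = selectBits(rightSide, BITS_PER_LONG - lastLength, lastLength);
        rightSide >>>= lastLength;                              // make room for the new bits
        rightSide |= lastValue << (BITS_PER_LONG - lastLength - indexInLong);
        data[longIndex] = leftSide | rightSide;                 // the two parts never overlap
        return shiftedOut;
    }

    // Inserts one bit and carries the displaced LSB through every following word.
    // The bit pushed out of the last word is dropped, so the caller must ensure spare capacity.
    static void insertBit(long[] data, int bitIndex, boolean bit) {
        int longIndex = bitIndex / BITS_PER_LONG;
        long carry = insertInLong(data, bit ? 1L : 0L, 1, longIndex++, bitIndex % BITS_PER_LONG);
        while (longIndex < data.length) {
            carry = insertInLong(data, carry, 1, longIndex++, 0);
        }
    }
}
```

Each word costs a handful of shifts and masks regardless of how many elements it holds, so an insertion moves 64 elements per step. That is consistent with the README's statement that the speedup comes from cheaper moves and resizes.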
