Introduction to Data Structures
Authors: Darren Yao, Benjamin Qi, Nathan Wang, Abutalib Namazov, Allen Li
Introduces the concept of a data structure, (dynamic) arrays, pairs, tuples.
Data Structures
A data structure determines how data is organized so that information can be used efficiently. Each data structure supports some operations efficiently, while other operations are either inefficient or not supported at all. Since different operations are supported by each data structure, you should carefully evaluate which data structure will work best for your particular problem.
C++
The C++ standard library data structures are designed to store any type of data. We put the desired data type within the <>
brackets when declaring the data structure, as follows:
1vector<string> v;
This creates a vector
structure that only stores objects of type string
.
For our examples below, we will primarily use the int
data type, but note that you can use any data type including string
and user-defined structures.
Essentially, every standard library data structure supports the size()
method, which returns the number of elements in the data structure, and the empty()
method, which returns true
if the data structure is empty, and false
otherwise.
Arrays
Warning!
Technically, you can solve all Bronze problems without using anything from this module aside from arrays. So the rest of this module isn't strictly necessary for Bronze (although highly recommended).
Resources | |||
---|---|---|---|
LCPP |
You already know one of the simplest data structures: the array! In C++11, in addition to normal arrays, there exists an array
class in the STL. For example, an array
of int
s can be initialized using the following line of code:
1array<int, 25> ar;
The array class supports the normal STL operations (such as .empty()
or .size()
) as well as the normal square-bracket access operator:
1ar[5] //access the element at the 5th index.
C++
In C++, arrays initialized locally using either the default syntax (i.e. int arr[10];
) or the array class are initialized to random numbers, because C++ doesn't have built-in memory management. In order to initialize an array to zero, you have several options:
- Use a for loop (or nested for loops).
- Declare the array globally.
- Add braces at the end of the declaration (i.e.
int arr[10]{};
). See LearnCpp for more information regarding this syntax. - Use a built-in function such as
memset(arr, 0, sizeof arr)
orstd::fill(arr, arr+10, 0)
.
Warning!
Make sure you know what memset
actually does before using it! memset
treats the value that is passed to it as an unsigned char
. So for an array of 32-bit integers, memset(arr, -1, sizeof arr)
will set each element to , as you might expect. However, memset(arr, 1, sizeof arr)
will set each element to , not .
Java
Python
Java
Java default Collections
data structures are designed to store any type of object. However, we usually don't want our data structures to only store one type of data, like integers or strings. We do this by putting the desired data type within the <>
brackets when declaring the data structure, as follows:
1ArrayList<String> list = new ArrayList<String>();
This creates an ArrayList
structure that only stores objects of type String
.
For our examples below, we will primarily use the Integer
data type, but note that you can have Collections of any object type, including Strings
, other Collections, or user-defined objects.
Collections
data types always contain an add
method for adding an element to the collection, and a remove
method which removes and returns a certain element from the collection. They also support the size()
method, which returns the number of elements in the data structure, and the isEmpty()
method, which returns true
if the data structure is empty, and false
otherwise.
Python
Python
Lists
The default way to store data in Python is using a list, which can automatically resize itself to accommodate more elements. We can add and delete elements at the end in time. A list can be initialized as follows:
1arr = []
Python lists are generic. This means that they can store any kind of data type, including objects. For example, the following code creates a dynamic array and adds the numbers through to it:
1for i in range(1, 11): # Note that range(i, j) includes i, but does not include j2 arr.append(i)
In Python, we can give a dynamic array an initial size. The code below creates a dynamic array with zeroes.
1arr = [0] * 30
Iterating
We can use a regular for loop to iterate through all elements of a list.
1arr = [1, 7, 4, 5, 2]23for i in range(len(arr)):4 print(arr[i], end = " ")5print()67for element in arr:8 print(element, end = " ")9print()
We can also use iterators. An iterator allows you to traverse a container by pointing to an object within the container. iter(arr)
returns an iterator pointing to the first element of the list arr
.
1arr = [4, 2, 0, 0, 5]2it = iter(arr)34print(next(it)) # 45print(next(it)) # 26print(next(it)) # 0
Inserting and Erasing
1arr = []2arr.append(2) # [2]3arr.append(3) # [2, 3]4arr.append(7) # [2, 3, 7]5arr.append(5) # [2, 3, 7, 5]6arr[1] = 4; # sets element at index 1 to 4 -> [2, 4, 7, 5]7arr.pop(1) # removes element at index 1 -> [2, 7, 5]8# this remove method is O(n); to be avoided9arr.append(8) # [2, 7, 5, 8]10arr.pop() # [2, 7, 5]
List Comprehensions
List comprehensions are extremely useful for simplifying a python for loop that modifies/creates a list into one expression. The general syntax is: [ expression for item in list if conditional ]
An example is provided in the code block below.
1# If a number is odd, add the number times 2 into the array2old_list = [2, 5, 3, 1, 6]3new_list = []4for i in old_list:5 if i % 2 == 1:6 new_list.append(i * 2);7print(new_list) # [10, 6, 2]8# Simplified one liner with list comprehension9# Recall the form [ expression for item in list if conditional ]10# expression: i * 2
A very applicable use of list comprehensions for competitive programming in particular is creating an integer list from space separated input:
1# Example input: 5 3 2 6 8 12# Note that the conditional in the list comprehension is optional, and defaults to True if not provided3arr = [int(x) for x in input().split()]4print(arr) # [5, 3, 2, 6, 8, 1]
For more information on list comprehensions, including nesting them to create multidimensional lists, refer to the below resources.
Resources | |||
---|---|---|---|
PythonForBeginners | Basic list comprehension tutorial | ||
GFG | Nesting list comprehensions |
C++
Dynamic Arrays
Resources | |||
---|---|---|---|
IUSACO | module is based off this | ||
CPH | vectors, strings | ||
PAPS | |||
LCPP |
Dynamic arrays (vector
in C++) support all the functions that a normal array does, and can resize itself to accommodate more elements. In a dynamic array, we can also add and delete elements at the end in time.
For example, the following code creates a dynamic array and adds the numbers through to it:
1vector<int> v;2for(int i = 1; i <= 10; i++){3 v.push_back(i);4}
In C++, we can give a dynamic array an initial size. The code below creates a dynamic array with zeroes.
1vector<int> v(30); // one way2vector<int> v; v.resize(30); // another way
In array-based contest problems, we'll use one-, two-, and three-dimensional static arrays most of the time. However, we can also have static arrays of dynamic arrays, dynamic arrays of static arrays, and so on. Usually, the choice between a static array and a dynamic array is just personal preference.
Unspecified Behavior
1vi res{-1};2int addElement() {3 res.pb(-1);4 return sz(res)-1;5}67void rec(int x) {8 F0R(i,10) {9 res[i] = addElement();10 ps(i,res[i]);
Running the above code with C++17
g++ -std=c++17 bad.cpp -o bad && ./bad
gives the intended output:
0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10
But running with C++14
g++ -std=c++14 bad.cpp -o bad && ./bad
gives:
0 -1 1 -1 2 3 3 -1 4 5 5 6 6 7 7 -1 8 9 9 10
However, the code works correctly if you save the result of addElement()
to an intermediate variable.
1void rec(int x) {2 F0R(i,10) {3 int tmp = addElement();4 res[i] = tmp;5 ps(i,res[i]);6 }7}
The problem is that res[i] = addElement();
only works if addElement()
is evaluated before res[i]
is. If res[i]
is evaluated first, and then addElement()
results in the memory for res
being reallocated, then res[i]
is invalidated. The order in which res[i]
and addElement()
are evaluated is unspecified (at least before C++17).
See this StackOverflow post for some discussion about why this is the case (also a similar issue here).
Iterating
Resources | |||
---|---|---|---|
CPH | |||
CPP | |||
LCPP |
One way to iterate through all elements of a static or dynamic array is to use a regular for loop.
1vector<int> v{1,7,4,5,2};2for (int i = 0; i < int(size(v)); i++){3 cout << v[i] << " ";4}5cout << endl;
We can also use iterators. An iterator allows you to traverse a container by pointing to an object within the container. However, they are not the same thing as pointers.
For example, v.begin()
or begin(v)
returns an iterator pointing to the first element of the vector v
. Apart from the standard way of traversing a vector (by treating it as an array), you can also use iterators:
1for (vector<int>::iterator it = v.begin(); it != v.end(); ++it) {2 cout << *it << " "; //prints the values in the vector using the iterator3}
Here is another way to write this. auto
(since C++11) automatically infers the type of an object:
1vector<int> v{1,7,4,5,2};2for (auto it = begin(v); it != end(v); it = next(it)) {3 cout << *it << " "; //prints the values in the vector using the iterator4}
We can also use a for-each loop.
1for (int element : v) {2 cout << element << " "; //prints the values in the vector3}
Inserting and Erasing
Keep in mind that insertion and erasure in the middle of a vector
are .
1vector<int> v;2v.push_back(2); // [2]3v.push_back(3); // [2, 3]4v.push_back(7); // [2, 3, 7]5v.push_back(5); // [2, 3, 7, 5]6v[1] = 4; // sets element at index 1 to 4 -> [2, 4, 7, 5]7v.erase(v.begin() + 1); // removes element at index 1 -> [2, 7, 5]8// this remove method is O(n); to be avoided9v.push_back(8); // [2, 7, 5, 8]10v.erase(v.end()-1); // [2, 7, 5]
Strings
This section is not complete.
All the vector operations above also work on strings.
Java
Dynamic Arrays
Dynamic arrays (ArrayList
in Java) that support all the functions that a normal array does, and can repeatedly reallocate storage to accommodate more elements as they are added.
In a dynamic array, we can also add and delete elements at the end in time. For example, the following code creates a dynamic array and adds the numbers through to it:
1ArrayList<Integer> list = new ArrayList<Integer>();2for(int i = 1; i <= 10; i++){3 list.add(i);4}
In array-based contest problems, we'll use one-, two-, and three-dimensional static arrays most of the time. However, we can also have static arrays of dynamic arrays, dynamic arrays of static arrays, and so on. Usually, the choice between a static array and a dynamic array is just personal preference.
Iterating
To iterate through a static or dynamic array, we can use either the regular for loop or the for-each loop.
1ArrayList<Integer> list = new ArrayList<Integer>();2list.add(1); list.add(7); list.add(4); list.add(5); list.add(2);3int[] arr = {1, 7, 4, 5, 2};4for(int i = 0; i < list.size(); i++){ // regular5 System.out.println(list.get(i));6}7for(int element : arr){ // for-each8 System.out.println(element);9}
Adding and Removing
We can add and remove at any index of a dynamic array in time.
1ArrayList<Integer> list = new ArrayList<Integer>();2list.add(2); // [2]3list.add(3); // [2, 3]4list.add(7); // [2, 3, 7]5list.add(5); // [2, 3, 7, 5]6list.set(1, 4); // sets element at index 1 to 4 -> [2, 4, 7, 5]7list.remove(1); // removes element at index 1 -> [2, 7, 5]8// this remove method is O(n); to be avoided9list.add(8); // [2, 7, 5, 8]10list.remove(list.size()-1); // [2, 7, 5]
Memory Allocation
One thing to keep in mind when using arrays is the memory limit. Usually the USACO memory limit is 256 MB. To estimate how many values can be stored within this limit:
- Calculate the total memory size in bytes: for 256 MB, that's .
- Divide by the size, in bytes, of an
int
(4), or along long
(8), etc. For example, the number ofint
s that you are able to store is bounded above by . - Be aware that program overhead (which can be very significant, especially with recursive functions) will reduce the amount of memory available.
Pairs & Tuples
If we want to store a collection of points on the 2D plane, then we can use a dynamic array of pairs.
C++
Of course, both vector<vector<int>>
and vector<array<int,2>>
would suffice for this case, but a pair can also store two elements of different types.
C++ Pairs
pair<type1, type2> p
: Creates a pairp
with two elements with the first one being oftype1
and the second one being oftype2
.make_pair(a, b)
: Returns a pair with valuesa
,b
.{a, b}
: With C++11 and above, this can be used as to create a pair, which is easier to write thanmake_pair(a, b)
.pair.first
: The first value of the pair.pair.second
: The second value of the pair.
Example:
1#include <bits/stdc++.h>2using namespace std;34int main() {5 pair<string, int> myPair1 = make_pair("Testing", 123);6 cout << myPair1.first << " " << myPair1.second << endl;7 myPair1.first = "It is possible to edit pairs after declaring them";8 cout << myPair1.first << " " << myPair1.second << endl;9 pair<string, string> myPair2 = {"Testing", "curly braces"};10 cout << myPair2.first << " " << myPair2.second << endl;
C++ Tuples
Of course, we can hold more than two values with something like pair<int,pair<int,int>>
, but it gets messy when you need a lot of elements. In this case, using tuples might be more convenient.
tuple<type1, type2, ..., typeN> t
: Creates a tuple withN
elements, i'th one being oftypei
.make_tuple(a, b, c, ..., d)
: Returns a tuple with values written in the brackets.get<i>(t)
: Returns thei
'th element of the tuplet
. Can also be used to change the element of a tuple.
This operation only works for constant i
. Namely, it is not allowed to do something like the following since i
is not constant:
1tuple<int,int,int> t{3,4,5};2int i = 1; cout << get<i>(t);
tie(a, b, c, ..., d) = t
: Equalsa, b, c, ..., d
to the elements of the tuple accordingly.
Example:
1#include <bits/stdc++.h>2using namespace std;34int main() {5 int a = 3, b = 4, c = 5;6 tuple<int, int, int> t = tie(a, b, c);7 cout << get<0>(t) << " " << get<1>(t) << " " << get<2>(t) << endl;8 get<0>(t) = 7;9 cout << get<0>(t) << " " << get<1>(t) << " " << get<2>(t) << endl;10
Java
Java
Warning: Language Note
Pairs and tuples are not available in Java's standard libraries, though we can work around that.
We can create our own generic Pair class in Java:
1static class Pair<K, V> {2 K first;3 V second;45 public Pair(K first_value, V second_value) {6 first = first_value;7 second = second_value;8 }9}
Then, we can use the Pair
class as follows:
1Pair<Integer, String> p = new Pair(5, "hello");2System.out.println(p.first + " " + p.second);
Python
Warning: Language Note
Pairs are not available in Python. Just use tuples; no need for pairs!
Python Tuples
Use the link below (if you know of a better one, please let us know!) to learn about tuples.
Resources | |||
---|---|---|---|
Tutorialspoint |
Problems
Nothing to see here! To reiterate, arrays of fixed size should suffice for essentially every Bronze problem, but dynamic arrays, pairs, and tuples can greatly simplify implementation at times.
Module Progress:
Join the USACO Forum!
Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!