### tree data structure

#############################
####   tree structure    ####
#############################

- the "root" node linked with child nodes of different depth thru a hierarchical strcture.

[]            // root
|
[]----[]         // child (or "parent" from its child perspective)
|      |
[]---[] []--[]      // grand child, leaf

- a node can have zero to multiple child nodes, depending on the requirement.
- often comes with a specific search algo technique.
- a tree where a node has only the max of two child nodes is called a "binary" tree.
- nodes with no child are sometimes called "leaf" nodes.
- the "depth/height" of a tree is the number of hops/edges between the root and the furthest/deepst child node.
(so the above tree is height 2. root is height zero. height usually refers to the deepest depth of the tree, whereas depth can be used to describe "depth X nodes in a height Y tree" etc)
- sometimes you hear the term descendant/ancestor nodes.

##
## full binary tree: a binary tree where every node has either two or zero child.
##

##
## perfect/complete binary tree: a full binary tree where every leaf node has the same depth.
##

###
###  balanced (binary) tree: for every node, the height of its left subtree and right subtree differ at most by one.
###

===> this is useful, because the min path from the root to the furthest node is minimized.
===> but obviously, not always efficient, e.g. when you have certain items more frequently accessed, etc, then you might want to build an imbalanced tree.
===> a "balanced" "binary" tree is always a classic CS topic, many implementations. (AVL tree, red-black tree, AA tree, splay tree, etc)

(ref)
http://stackoverflow.com/questions/8015630/definition-of-a-balanced-tree
http://en.wikipedia.org/wiki/Self-balancing_binary_search_tree

an example C++ struct definition for a LL-based binary tree node would look like below.

------------------------------------

template<typename T> strcut node{
struct node* left;
struct node* right;
struct node* parent;    // often this is omitted when the binary tree is used for DFS searching purpose only.

------------------------------------

====> intuitive, but we can also express a tree structure using array.

e.g. we declare an array,

root:  arr[1]
left:  arr[2i]
right: arr[2i+1]

[0][1][2][3][4][5][6][7][8][9]

====> we will see a real example in the heap sort section.

(ref)
http://en.wikipedia.org/wiki/Tree_(data_structure)
http://ppp-lab.sakura.ne.jp/ProgrammingPlacePlus/algorithm/data_struct/007.html

#####################################
####    tree traversal methods    ###
#####################################

There are two kinds:
(1) depth-first search (DFS)      // aka backtracking

===> the name essentially says everything, both are traversal methods.
===> depth-first explores each branch to the end then backtrack (both recursive and non-recursive are possible). commonly used in binary search tree.
===> breadth-first explores all immediate nearest-hop nodes, then their nearest, and so on. commonly used in graph algo.

(ref)
http://en.wikipedia.org/wiki/Depth-first_search

for binary tree, (1) depth-first search is commonly used.
there are three kinds of DFS traversal(search) methods.

(1) pre-order      //  root, left, right
(2) in-order       //  left, root, right
(3) post-order     //  left, right, root        // imagine, you want to free the whole tree where you malloc'ed each node, then this is the choice.

if we use breadth-first search, then we traverse every depth(level) one by one.

----------------------------------------// BFS_print_tree psuedo code
Q.enqueue(root);
while(!Q.isEmpty())

u = Q.dequeue();
print u.data;
if( !u.left ) Q.enqueue(u.left);
if( !u.right) Q.enqueue(u.right);

----------------------------------------

(ref)
http://en.wikipedia.org/wiki/Tree_traversal

######################################
####   binary search tree (BST)   ####
######################################

at any given subtree, the value of each node has to satisfy the below relation.

left_child < parent < right_child

so searching for an elem in a binary tree can be done in O(log n), "assuming" you keep it balanced.
an ultimately imbalanced tree is essentially just a LL. so the complexity can be O(n)

NOTE: how to deal with if the same value, but to simplify, lets not allow dupe elems.
NOTE: depending on how you construct the BST, it can be inefficient.

e.g.

[4]
|
[2]---[5]
|     |
[1]-[3][]-[6]

===> a well balanced tree

e.g.

[6]
|
[5]-[]
|
[4]-[]
|
[3]-[]
|
[2]-[]
|
[1]-[]

===> an imbalanced tree

###############################################
####   delete(node) operation for a tree   ####
###############################################

when deleting a node, you need to promote a descendant node, specifically either (1) the right most (biggest) node of the left subtree, or (2) the left most (smallest) node of the right subtree.

e.g.

[123]
|
[55]-------[777]
|           |
[3]--[66]  [150]--[9999]
|
[60]-[]
|
[59]-[61]

====> if you delete [123], then promote either [66] or [150] in its place.
====> notice how we leave [66]'s left subtree in its place

[66]
|
[55]-------[777]
|           |
[3]--[60]  [150]--[9999]
|
[59]-[61]

#################################################################
####   DFS traversal of a BST implementation example (C++)   ####
#################################################################

lets do all (1) pre-order, (2) in-order, (3) post-order, in a simple BST tree. no self-balancing ability.

---------------------------------------------------------- // bt.h

#include <iostream>
using namespace std;

template<typename T> struct node

node(T t): left(0),right(0){ data = t; }
T data;
struct node* left;
struct node* right;
};

template<typename T> class BT

public:
BT(): root(0){}
void del(T);
int search(T);
void preOrder();
void inOrder();
void postOrder();
private:
void m_preOrder(struct node<T>*);
void m_inOrder(struct node<T>*);
void m_postOrder(struct node<T>*);
struct node<T>* root;
};

struct node<T>* new_node = new node<T>(t);
struct node<T>* tmp = root;
if(root == 0){
root = new_node;
return;
}

while(1)
{
if(tmp->data == t)
return;
else if(tmp->data > t)
if(tmp->left == 0)
tmp->left = new_node;
else
tmp = tmp->left;
else
if(tmp->right == 0)
tmp->right = new_node;
else
tmp = tmp->right;
}

template<typename T> void BT<T>::del(T t)

if(root == 0)
return;

struct node<T>** tmp = &root;
struct node<T>* it = root;
// as explained in pointer/reference article, *tmp = 0 will really overwrite root = 0, but it = 0 will not change root = 0
/*

it -> node_A -> node_B
root -> node_A -> node_B

so, while overwriting it will only change where that particular pointer points to, if you overwrite it->next = 0,
that will overwrite root->next as well. because both are essentially node_A->next pointer

*/
struct node<T>** sub;
struct node<T>* subit;

while(1)
{
it = *tmp;
break;
else if(it->data == t)   // found the guy to delete
{
if(it->left == 0 && it->right == 0) // if no child just delete it
*tmp = 0;
else if(it->left == 0) // only the right node present, then promote it
*tmp = it->right;
else if(it->right == 0) // only the left node present, then promote it
*tmp = it->left;
else // both child nodes present, then promote the right-most leaf(biggest val) of the left subtree
{  // but be careful, that right-most leaf node X can have its own left subtree Y which we need to leave in X's place

sub = &(it->left);   // this sub/subit business is the same deal as tmp/it trick
while(1)
{
subit = *sub;
if(subit->right == 0)
break;
sub = &(subit->right);
}
*sub = subit->left;   // replacing the right-most leaf(X) of the left subtree with the left subtree of X
*tmp = subit;
(*tmp)->right = it->right;
(*tmp)->left = it->left;
}
delete it;
}
else if(it->data > t)
tmp = &(it->left);    // precisely due to the above pointer explanation we need to do stuff like this
else
tmp = &(it->right);
}

template<typename T> int BT<T>::search(T t)

int depth = 0;
struct node<T>* temp = root;

while(1)
{
if(temp == 0)
return -1;                // depth -1 is for no match.
else if(temp->data == t)
return depth;
else if(temp->data > t)
temp = temp->right;
else
temp = temp->left;
depth++;
}

template<typename T> void BT<T>::preOrder()

m_preOrder(root);

template<typename T> void BT<T>::m_preOrder(struct node<T>* tmp)

if(tmp == 0)
return;

cout << tmp->data << endl;
m_preOrder(tmp->left);
m_preOrder(tmp->right);

template<typename T> void BT<T>::inOrder()

m_inOrder(root);

template<typename T> void BT<T>::m_inOrder(struct node<T>* tmp)

if(tmp == 0)
return;

m_inOrder(tmp->left);
cout << tmp->data << endl;
m_inOrder(tmp->right);

template<typename T> void BT<T>::postOrder()

m_postOrder(root);

template<typename T> void BT<T>::m_postOrder(struct node<T>* tmp)

if(tmp == 0)
return;

m_postOrder(tmp->left);
m_postOrder(tmp->right);
cout << tmp->data << endl;

----------------------------------------------------------
---------------------------------------------------------- // main.cpp

#include "bt.h"
#include <iostream>
using namespace std;

int main()

BT<int> bt;
bt.add(9999);   // insert to right right
bt.add(66);     // insert to left right
bt.add(150);    // insert to right left
bt.add(3);      // insert to left left
bt.add(60);     // insert to left right left
bt.add(7);      // insert to left left left

/*

[123]
|
[55]--------[777]
|            |
[3]----[66]  [150]--[9999]
|       |
[]-[7][60]-[]

*/

cout << "preOrder" << endl;    // (root -> left -> right)
bt.preOrder();                 // 123 55 3 7 66 60 777 150 9999

cout << "inOrder" << endl;     // (left -> root -> right)
bt.inOrder();                  // 3 7 55 60 66 123 150 777 9999

cout << "postOrder" << endl;   // (left -> ritht -> root)
bt.postOrder();                // 7 3 60 66 55 150 9999 777 123

bt.del(9999);   // validates no-child case
bt.del(777);    // validates only-left-child case
bt.del(123);    // validates both-children case
bt.del(3);      // validates only-right-child case

cout << "inOrder" << endl;     // (left -> root -> right)
bt.inOrder();                  // 7 55 60 66 150

/*

[66]
|
[55]--------[150]
|
[7]----[60]

*/

return 0;

----------------------------------------------------------

##################################
####    height of the tree    ####
##################################

psuedo code:
--------------------------------------
func height(node)

if(node == null)
return -1;
else
return 1 + max(height(node->left),height(node->right));

--------------------------------------

===> by changing max() with min() you get the shortest branch height. by comparing the two, you can check if it's BST.

###################################
####   interpolation search    ####
###################################

one simple technique to enhance a binary search is called interpolation search.

like any other binary search, we need the target data set to be "already sorted".

a usual structure for binary search can be either LL-based or array-based, but for this interpolation search technique, we need the array-based structure for random accessibility.

the logic is:
lets say you have the array of size 300. and the data is between 1 to 1000, already sorted.
and your search val is 750. where do you look? probably at the 75% mark of the array. (ASSUMING the uniform distribution of the input data)

if ( target > arr[(first + last) / 2] )

we do

if ( target >  ( (target - arr[first]) / (arr[last] - arr[first]) ) * (last - first)  )

(ref)
http://trivial-contents.com/programming/algorithm/search/interpolation_search.html
http://ppp-lab.sakura.ne.jp/ProgrammingPlacePlus/algorithm/search/005.html
http://en.wikipedia.org/wiki/Interpolation_search

1. 2014-06-10 00:48:36 |
2. Category : data_structure
3. Page View: