• Welcome to the Internet Infidels Discussion Board.

Algoritms

steve_bank

Diabetic retinopathy and poor eyesight. Typos ...
Joined
Nov 9, 2017
Messages
16,520
Location
seattle
Basic Beliefs
secular-skeptic
I have been doodling to get back into it.

Sorting abd searching are najor areas. Although computer speeds have increased so has the size of data sets. Data ming requires multiple searches and corelations on large data bases.

starting simple finding a a nax or min value is easy. C is my language but it should be readable.

int find_max(int *list[]){ // pass a pointer to the start of the list

// finds max number in an integer array
// array is passed by pointer

int number_entries, i, max_value;

number_entries = szeof(list)/sizeof(list[0]); // Divide total size of list by size of one entry
max_value = list[0]; // initialize to first element c arrays start at 0 not 1

For (i = 1; i < number_entries; i++){
if(list > max_value) max_value = list;
}

return(max_value);

}// find_max() end

integer array 2, 0, 4, 1
max_value 2, 2, 4, 4

I'd have to run it to make sure it worked for negative numbers
 
You won't get the length of the list with sizeof(). Instead, you'll get the size of the list's pointer. You'll need to pass as args the list pointer and the list size.

But otherwise, the code looks correct.


sizeof() is most useful when you have to do something byte-by-byte, like with malloc() or memcpy() or memset().
 
You won't get the length of the list with sizeof(). Instead, you'll get the size of the list's pointer. You'll need to pass as args the list pointer and the list size.

But otherwise, the code looks correct.


sizeof() is most useful when you have to do something byte-by-byte, like with malloc() or memcpy() or memset().

sizeof gives you the total size of the array. Dividing by the size of on element yields the number of entries.f( cint *x[]) tells the complier the pointer is to an array.

f(int*a) is not the same as f(int *a[]). In the latter the compiler sees a as an array of ints. In the former the compiler sees a single int.

int *a, sizeof(a) yields size of the pointer.
int *a[], sizeof(a) yields total count in the array. sizeof(a/sizeof(a[0]) yields number of integers in the array.

https://stackoverflow.com/questions/37538/how-do-i-determine-the-size-of-my-array-in-c

When passing arrays of variable by pointer to functions you have to determine the max number of elements in the array to prevent indexing a pointer where you do not want to. Indexing past the end of an array in C using a pointer is a problem in C requiring precautions. C is pointers.

sizeof() has a number of uses. String processing. With the amount of PC memory tyese days I doughty dynamic memory allocation has a lot of use. It used to be when doing a lot of work with large arrays you had to swap them in and out of memory.
 
Working my way up. A bubble sort.

#define TRUE 1
#define FALSE 0
#define FOREVER 1

void bubble_sort(int *list[]){

// sorts a list of integers in ascending order

int list_length, j, swap_flag, temp;
list_length = sizeof(list)/sizeof(list[0]);

while(FOREVER){

swap_flag = FALSE;

for(k = 0; k < list_length -2, k++){

if(list[k] > list[k+1]{list[k]){temp = list[k]; list[k] = list[k+1]; list[k+1] =temp; swap_flag = TRUE;}

} //for

if(swap_flag == FALSE) break;

} //while
} bibble_sort() ends
 
All variables and function nanmes in C are pointers. In C you can create a pointer and set it to any point in memeory. It allows for less code and faster operation but can be dangerous if used without thought.

When you compile a high level language the compiler does not know where it will go in memory. The compiler creates assembly language code and hhe assembler creates machine binary executable code. Code for an OS like Windows is reloadable. The binary file has a table with relative positions of variables and functions. The absolute agrees the OS places the code at in memory is used to resolve memory locations.

All variables in a high level language are aliases for an absolute memory location when loaded.

https://www.programiz.com/c-programming/c-pointers
 
Finding the occurrences of an item in an integer list, creating a list of locations in the list.

#define ERROR -1

int find_occurences(int target, int *list[]; int *results[]){

// list and results are existing arrays. sizes must match for the state where
//the target exists in the entire list.

lnt k, count, results_length list_length;
list_length = sizeof(list)/sizeof(list[0]);
results_length = sizeof(results)/sizeof(results[0]);
if(list_length != results_length)return(ERROR);
count = 0; //counts number of matches, indexes results array
for(k = 0; k < list_length; k++){if(target == list[k]){results[count++] = k;}//store location of match in results
return(count);
 
sizeof() will get the size of a static array, but not a dynamically-allocated one.

Code:
/*

Tests sizeof()

*/

#include <cstdio>
using namespace std;

#define PRINTSIZE(x) printf("%lu\n",sizeof(x));

int main(void) {

	int val = 0;
	int arsta[] = {1,2,3,4,5,6,7,8,9,10};
	int *ardyn = new int[10];
	
	PRINTSIZE(val)
	PRINTSIZE(arsta)
	PRINTSIZE(ardyn)
	
	return 0;
}
Output:
4
40
8
 
I redid it with better labeling:
Code:
/*

Tests sizeof()

*/

#include <cstdio>
using namespace std;

#define PRINTSIZE(lbl,x) printf("%s: %lu\n",lbl,sizeof(x));

int main(void) {
	
	char val1 = 1;
	short val2 = 2;
	int val3 = 3;
	long val4 = 4;
	long long val5 = 5;
	float val6 = 6;
	double val7 = 7;
	long double val8 = 8;
	int arsta[] = {1,2,3,4,5,6,7,8,9,10};
	int *ardyn = new int[10];
	
	PRINTSIZE("char",val1)
	PRINTSIZE("short",val2)
	PRINTSIZE("int",val3)
	PRINTSIZE("long",val4)
	PRINTSIZE("long long",val5)
	PRINTSIZE("float",val6)
	PRINTSIZE("double",val7)
	PRINTSIZE("long double",val8)
	PRINTSIZE("static array: 10 of int",arsta)
	PRINTSIZE("dynamic array: 10 of int",ardyn)
	
	return 0;
}
Output:
char: 1
short: 2
int: 4
long: 8
long long: 8
float: 4
double: 8
long double: 16
static array: 10 of int: 40
dynamic array: 10 of int: 8

OSX / macOS now runs in 64-bit mode, and its C++ compiler assumes 64-bit mode. It can still run 32-bit apps, though Apple has deprecated them with warnings that they won't run in future versions.
 
Moving on to strings. You need to understand the ASCII format. Characters in strings are normally stored in ASCII format.

https://www.ascii-code.com/

Strings are terminated by the ASCII NULL code 000. Looking at the table it can be seen sections of characters like lower case letters are in increasing values. This means characters can be sorted by a numerical value without a lot of complicated logic. Determining the size of a string.

#define NULL 0x00
#define MAX_STRING_LENGTH 1000
#define STRING_ERROR 1

int string_length(char *str[]){

// NULL terminator counted as a character.

int k;
for(k = 0; k < MAX_STRING_LENGTH; k++){if(str[k] == NULL)return(k + 1);}
return(STRING_ERROR);
}
 
Sorting text strings for the entire chatacter set would require setting up a precedence order. Sorting by just lower case letters is easy. To do a bubble sort on an array of text strings requires a string compare function.

Looking at the table the difference between upper and lower case letters is one bit. One way to sort is to ignore case and convert all characters to upper or lower case.




int string_compare(char s1[]; char s2[]){

// checks to see if s2 is equal to, greater than, or less than s1
// in terms of an ascending character sort
// assumes all lower case or upper case letters

int k, ls1l ls2;

// check for valid strings and equal string lengths
if( (ls1 = string_length(s1)) == STRING_ERROR)return(STRING_ERROR).
if( (ls2 = string_length(s2)) == STRING_ERROR)return(STRING_ERROR).
if(ls1 != ls2)return(STRING_ERROR);


// if first chars do mot math sort by first char
if( s1[k] < s2[k]) return (ABOVE);
if( s1[k] > s2[k]) return (BELOW);

// sort by first mismatch following a match
For(k = 0; k < ls1; k++){
if( (s1[k] == s2[k]) && ( (s1[k + 1] < s2[k + 1]) ) return (ABOVE);
if( (s1[k] == s2[k]) && ( (s1[k + 1] > s2[k + 1]) ) return (BELOW);
}
return(EQUAL);
}

john
jack
josh

sorts to

jack
john
josh
would sor5 to jake
 
Last edited:
Changing text case.

In decimal ASCII upper case characters run from 65 to 90, lower case 97 to 122. The difference between upper and lower case is bit 5. For upper case bit 5 is 0 lower case bit 5 is 1.

C has bit wise logical operators. To set to 1 bit 5 OR the character with the binary mask 00100000 or 0x20 hexadecimal. To clear bit 5 to 0 AND the character with the mask 11011111 binary or 0xdf.

In C & is bitwise AND , | bitwise OR. && logical AND, || logical OR

11111111 & 11011111 = 11011111
00000000 | 00100000 = 00100000

The bitwise OR and AND go bit by bit, so for a 1 bit mask no other bits are affected. Only the mask bit changes, the other bits stay the same.

void upper_case(char *str){

// if char is lower case flip bit 5 to 0

int j = 0;

while(str[j] != NULL){
if( (str[j] >= 97) && (str[j] <= 122) ) {str[j] = str[j] & 0xdf; j++;}
}//wile
} // upper_case() ends

void lower_case(char *str){

// if char is upper case flip bit 5 to 1

int j = 0;

while(str[j] != NULL){
if( (str[j] >= 65) && (str[j] <= 90) ) {str[j] = str[j] | 0x20; j++;}
}//wile
} // lower_case() ends
 
Last edited:
In the Congress as a whole-both House and Senate-the enhanced role of money in the re-election process, coupled with the sharply diminished role for reasoned deliberation and debate, has produced an atmosphere conducive to pervasive institutionalized corruption.
- Al Gore

That is my Al-Gore-ithm
 
In the Congress as a whole-both House and Senate-the enhanced role of money in the re-election process, coupled with the sharply diminished role for reasoned deliberation and debate, has produced an atmosphere conducive to pervasive institutionalized corruption.
- Al Gore

That is my Al-Gore-ithm

Spam a lam a ding dong...wrong forum try politics.
 
Changing text case.

In decimal ASCII upper case characters run from 65 to 90, lower case 97 to 122. The difference between upper and lower case is bit 5. For upper case bit 5 is 0 lower case bit 5 is 1.

C has bit wise logical operators. To set to 1 bit 5 OR the character with the binary mask 00100000 or 0x20 hexadecimal. To clear bit 5 to 0 AND the character with the mask 11011111 binary or 0xdf.

In C & is bitwise AND , | bitwise OR. && logical AND, || logical OR

11111111 & 11011111 = 11011111
00000000 | 00100000 = 00100000

The bitwise OR and AND go bit by bit, so for a 1 bit mask no other bits are affected. Only the mask bit changes, the other bits stay the same.

void upper_case(char *str){

// if char is lower case flip bit 5 to 0

int j = 0;

while(str[j] != NULL){
if( (str[j] >= 97) && (str[j] <= 122) ) {str[j] = str[j] & 0xdf; j++;}
}//wile
} // upper_case() ends

void lower_case(char *str){

// if char is upper case flip bit 5 to 1

int j = 0;

while(str[j] != NULL){
if( (str[j] >= 65) && (str[j] <= 90) ) {str[j] = str[j] | 0x20; j++;}
}//wile
} // lower_case() ends

Don't do this unless you're using a time machine.

The modern world runs on Unicode, not ASCII. When you do this you limit your code to running on our alphabet, it will fail even in many places in Europe because of a few oddball characters that are added.
 
You threw me a curve ball I did not think of. Unicode is a superset of ASCII. If it were not it would have a serious backward compatibility problem.

If you are writing an app what you get is the keyboard scan code. From that you can convert to 8 bit ASCII or 16 bit Unicode. ASCII and Unicode have the same numerical values for the 8 bias the ASCII set.

For a given app ASCII may be fine. For example an embedded system with a keyboard and display interface.

As I said I am doodling to bring some thing's that are fading, not writing production code. In a given app simple text files have use. Unicode was is 16 bit providing more charters and formatting. Not all apps implement the full Unicode.

Thanks for the observation.

- - - Updated - - -

!!!Warning!!!

Use code at your own risk. Code is provided as is with no warranty or guarantee.
 
Speeding up seahing is an ongoing issue.

A simple way is a random search rather than a linear search. It can reduce the average search time.

random_index[] is an array of random integers from 0 to the highest data array index

For(j = 0; j < length, j++{if( data_array[random_index[k]] == target) match_index = random_index[k];

I am working on a binary search algorithm.

https://en.wikipedia.org/wiki/Binary_search_algorithm#Hashing
 
Sorting Algorithm Animations | Toptal -- animations of different sorting algorithms on four kinds of data: random, nearly sorted, reversed, and with few unique. The algorithms:
  • Insertion
  • Selection
  • Bubble
  • Shell
  • Merge
  • Heap
  • Quick
  • Quick3
From that page, the ideal sorting algorithm is:
  • Stable: Equal keys aren’t reordered.
  • Operates in place, requiring O(1) extra space.
  • Worst-case O(n·lg(n)) key comparisons.
  • Worst-case O(n) swaps.
  • Adaptive: Speeds up to O(n) when data is nearly sorted or when there are few unique keys.
It is not possible to have all of those properties.
 
In the Congress as a whole-both House and Senate-the enhanced role of money in the re-election process, coupled with the sharply diminished role for reasoned deliberation and debate, has produced an atmosphere conducive to pervasive institutionalized corruption.
- Al Gore

That is my Al-Gore-ithm

Spam a lam a ding dong...wrong forum try politics.

oh sorry, I couldn't find an Al Gore ithm about programming... although he did invent the internet (in his imagination).
 
You threw me a curve ball I did not think of. Unicode is a superset of ASCII. If it were not it would have a serious backward compatibility problem.

If you are writing an app what you get is the keyboard scan code. From that you can convert to 8 bit ASCII or 16 bit Unicode. ASCII and Unicode have the same numerical values for the 8 bias the ASCII set.

For a given app ASCII may be fine. For example an embedded system with a keyboard and display interface.

As I said I am doodling to bring some thing's that are fading, not writing production code. In a given app simple text files have use. Unicode was is 16 bit providing more charters and formatting. Not all apps implement the full Unicode.

Thanks for the observation.

- - - Updated - - -

!!!Warning!!!

Use code at your own risk. Code is provided as is with no warranty or guarantee.

The thing with Unicode is that there are a lot more uppercase/lowercase pairs than just our 26.

- - - Updated - - -

Speeding up seahing is an ongoing issue.

A simple way is a random search rather than a linear search. It can reduce the average search time.

random_index[] is an array of random integers from 0 to the highest data array index

For(j = 0; j < length, j++{if( data_array[random_index[k]] == target) match_index = random_index[k];

I am working on a binary search algorithm.

https://en.wikipedia.org/wiki/Binary_search_algorithm#Hashing

This will never be faster unless your searches are concentrated towards the tail of the list (and if you know that, linear search the other direction would be faster than this.)
 
Unicode has several implementations. The thread is on coding solutions not debating encoding text.

Take at stab at modifying what I wrote for 16 and 32 bit Unicode.

Python, C, Basic, use what you know.

The way I learned and still do is come up with a problem and figure out hoe to solve it.

I am not set up to-do it right now. It would be easy to create a Radom array of values and create an array or random indexes and see what the average search time, number of trials to find a value, vs a linear search of the data array.
 
Back
Top Bottom