
Implementation of quantifiers (all, any) in programming languages

Jokodo — Veteran Member · Joined Dec 28, 2010 · Messages 4,775 · Location Riverside City · Basic Beliefs: humanist
What's the cause for quantificational operators operating the way they do on empty lists?

Note that I'm asking for the cause, not the reason. If you wish, the implementation. I'm fully aware that any(<empty list>) should return FALSE and all(<empty list>) TRUE, not least because you want to make sure that ALL x: NOT x and NOT(ANY(x)) return the same value. But it doesn't seem to follow from how I would naively go about building those Quantifiers from unary or binary operators like NOT/AND/OR. If you think of ALL as a string of ANDs (a AND b AND c ...), it would seem more natural to end up casting an empty list to FALSE (just as a sequence of OR connectives for ANY).

Do those quantifiers explicitly treat empty lists separately to ensure consistent behaviour to specification (e. g. "IF list_is_empty(); then return TRUE; else apply_normal_logic(); FI" for ALL), or are they implemented in a way very different from what I'd naively expect?
 
I was pretty much a C/C++ programmer, which is an operator-rich language, so I don't know what you mean by "operator" and "normal logic". It has been a long time, but here goes, off the top of my head.

There are many ways to code. Some of it is a matter of personal style, some is just bad code. It is all up to the coder. A language like C is considered a low-level language in that it is rich in low-level operations and operators that can be used to code an algorithm. On the other end are languages like BASIC which do not allow as much freedom. It is a two-edged sword. In a language like C you can be creative and write code in a lot fewer lines than BASIC, but it is easier to get into trouble.

Scroll down to the logical operators. It is all 'normal logic'.

https://en.wikipedia.org/wiki/Operators_in_C_and_C++

#define TRUE 1
#define FALSE 0

int empty_list_test(int num_lists, int **pointer_to_lists){

// pointer_to_lists specifies the lists to be checked
// num_lists specifies number of lists

int non_empty_count = 0; // counter for the number of non-empty lists
int i;

for(i = 0; i < num_lists; i++){

// ...code to check whether list i is empty...

if(list_not_empty) non_empty_count += 1;

} // for

if(non_empty_count == 0) return(TRUE);

return(FALSE);

} // empty_list_test()

int main(void){

if(empty_list_test(n_lists, ptr_array) == TRUE) printf("\nAll Lists Empty"); else printf("\nAll Lists Not Empty");

return 0;
} // main()

As an alternative I'd return the number of non-empty lists instead of a binary result. Or a pointer array to the non-empty lists.
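For comparison, that alternative is nearly a one-liner in Python (a sketch; `lists` is just a made-up example value):

```python
lists = [[1, 2], [], [3], []]

# count of non-empty lists instead of a binary result
non_empty_count = sum(1 for lst in lists if lst)

# or keep references to the non-empty lists themselves
# (the analogue of returning a pointer array)
non_empty = [lst for lst in lists if lst]

print(non_empty_count)  # → 2
print(non_empty)        # → [[1, 2], [3]]
```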

If the list is an array of character strings, like names, the empty test could be done with string functions.

In C there are a number of functions to evaluate strings, a list being an array of strings. There are usually much easier ways to do things than a long series of explicit logical chains.

https://www.tutorialspoint.com/cprogramming/c_strings.htm
 
What's the cause for quantificational operators operating the way they do on empty lists?

Note that I'm asking for the cause, not the reason. If you wish, the implementation. I'm fully aware that any(<empty list>) should return FALSE and all(<empty list>) TRUE, not least because you want to make sure that ALL x: NOT x and NOT(ANY(x)) return the same value. But it doesn't seem to follow from how I would naively go about building those Quantifiers from unary or binary operators like NOT/AND/OR. If you think of ALL as a string of ANDs (a AND b AND c ...), it would seem more natural to end up casting an empty list to FALSE (just as a sequence of OR connectives for ANY).

Do those quantifiers explicitly treat empty lists separately to ensure consistent behaviour to specification (e. g. "IF list_is_empty(); then return TRUE; else apply_normal_logic(); FI" for ALL), or are they implemented in a way very different from what I'd naively expect?

I have no idea how they are actually implemented but I would be inclined to implement them thus:

Any:
    for each item in list:
        if match: return true
    return false

All:
    for each item in list:
        if not match: return false
    return true

Note that this will get the empty list behavior you describe.
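Written out as runnable Python (using truthiness as the "match" test, the way Python's own builtins do), that pseudocode gives exactly the empty-list behavior in question, because the default return is only reached when no element decided the answer:

```python
def my_any(iterable):
    for item in iterable:
        if item:          # first match wins
            return True
    return False          # no item matched -- including the empty-list case

def my_all(iterable):
    for item in iterable:
        if not item:      # first mismatch wins
            return False
    return True           # no item mismatched -- including the empty-list case

# for an empty list the loop body never runs, so we fall through to the default
assert my_any([]) is False
assert my_all([]) is True
```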
 
What's the cause for quantificational operators operating the way they do on empty lists?

Note that I'm asking for the cause, not the reason. If you wish, the implementation. I'm fully aware that any(<empty list>) should return FALSE and all(<empty list>) TRUE, not least because you want to make sure that ALL x: NOT x and NOT(ANY(x)) return the same value. But it doesn't seem to follow from how I would naively go about building those Quantifiers from unary or binary operators like NOT/AND/OR. If you think of ALL as a string of ANDs (a AND b AND c ...), it would seem more natural to end up casting an empty list to FALSE (just as a sequence of OR connectives for ANY).

Do those quantifiers explicitly treat empty lists separately to ensure consistent behaviour to specification (e. g. "IF list_is_empty(); then return TRUE; else apply_normal_logic(); FI" for ALL), or are they implemented in a way very different from what I'd naively expect?

I have no idea how they are actually implemented but I would be inclined to implement them thus:

Any:
    for each item in list:
        if match: return true
    return false

All:
    for each item in list:
        if not match: return false
    return true

Note that this will get the empty list behavior you describe.

Hmm... now I want to do some benchmarking with long lists where the first match/mismatch is towards the end. Maybe also comparing the semantically equivalent structures

not any(<list>)
vs.
all([not x for x in <list>])

Add: I did play around a bit on the python console. Anything more sophisticated will have to wait for another day (past 11pm here).
It appears that not(any(LIST)) is significantly faster.

Code:
>>> import timeit
>>> setup = """
... longlist = [0 for i in range(1000000)] # create a list of 1M 0s
... longlist.append(1) # append a 1 in position 1000000
... longlist.extend(longlist) # add that list to itself
... """  # this is the part of the code that will be executed outside of the actual benchmarking
>>> # a generator function, should be the fastest way to implement it this way around:
>>> timeit.timeit('all((not x for x in longlist))', setup=setup, number=100)
2.9960989952087402
>>> timeit.timeit('not any(longlist)', setup=setup, number=100)
0.5046029090881348
>>> # this creates a new list with 2000002 entries before it does anything else, expected to be slow:
>>> timeit.timeit('all([not x for x in longlist])', setup=setup, number=100)
5.008944034576416
 
Actually, that was pretty meaningless. Did I mention it's late here?

The slow part isn't the ALL; the slow part is negating each element.

If we add the line `reversed_list = [not x for x in longlist]` to the setup, `all(reversed_list)` is actually 20% faster than `not any(longlist)` with the original version. And the ratio stays nearly constant independent of the size of the list, so it's definitely not the one-time cost of negating the booleans.

Python's `any` is slower than `all` when the first element that makes it true (or false, in the case of `all`) is deep inside a long list, but they perform the same when the decision can only be made at the last element.
 
What's the cause for quantificational operators operating the way they do on empty lists?

Note that I'm asking for the cause, not the reason. If you wish, the implementation. I'm fully aware that any(<empty list>) should return FALSE and all(<empty list>) TRUE, not least because you want to make sure that ALL x: NOT x and NOT(ANY(x)) return the same value. But it doesn't seem to follow from how I would naively go about building those Quantifiers from unary or binary operators like NOT/AND/OR. If you think of ALL as a string of ANDs (a AND b AND c ...), it would seem more natural to end up casting an empty list to FALSE (just as a sequence of OR connectives for ANY).

Do those quantifiers explicitly treat empty lists separately to ensure consistent behaviour to specification (e. g. "IF list_is_empty(); then return TRUE; else apply_normal_logic(); FI" for ALL), or are they implemented in a way very different from what I'd naively expect?

It would probably help if you narrowed down your programming language. I doubt there is any correct, general answer. Is this even consistent behavior across programming languages that provide `any` and `all` functions?

- - - Updated - - -

Actually, that was pretty meaningless. Did I mention it's late here?

The slow part isn't the ALL; the slow part is negating each element.

If we add the line `reversed_list = [not x for x in longlist]` to the setup, `all(reversed_list)` is actually 20% faster than `not any(longlist)` with the original version. And the ratio stays nearly constant independent of the size of the list, so it's definitely not the one-time cost of negating the booleans.

Python's `any` is slower than `all` when the first element that makes it true (or false, in the case of `all`) is deep inside a long list, but they perform the same when the decision can only be made at the last element.

Yes, because `any` and `all` short-circuit in Python.


If (C)Python is what you are interested in, here is the implementation:

https://github.com/python/cpython/b...555ce87a5f6cbc876f0/Python/bltinmodule.c#L328


Essentially, it's the algorithm described by Loren's pseudocode. I suspect most languages implement it that way.



As an aside, I would keep in mind that iterating over generators requires a *lot* of overhead, *especially* compared to iterating over lists. To compare apples to apples, I would wrap your arguments to `any` and `all` in generator expressions to get a better sense of the actual differences.
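Concretely, something like this (a sketch; exact timings will vary by machine) wraps both sides in a generator expression so the per-item generator overhead is the same for both:

```python
from timeit import timeit

# a list of 100k falsy values with a single truthy value at the end,
# so both expressions must scan the whole list before deciding
longlist = [0] * 100_000 + [1]

# both sides now pay the same generator overhead per element
t_any = timeit('not any(x for x in longlist)', number=10,
               globals={'longlist': longlist})
t_all = timeit('all(not x for x in longlist)', number=10,
               globals={'longlist': longlist})

print(f'not any(...): {t_any:.4f}s   all(not ...): {t_all:.4f}s')
```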
 
The speed problem with scripted languages is that they are generally interpreted, not compiled, so they are inherently slow. There is a software layer between scripted code and the actual processor machine code that incurs a speed penalty.

They are good for some things. If you need to routinely process a high number of byte-by-byte comparisons of text lists, then a more efficient language is better. If you want the absolute fastest speed, then parts of the code could be done directly in assembly language.

The same is true for math like Fast Fourier Transforms, which require a large number of floating-point multiplications, or large matrix operations.

Scroll down to the string-searching functions: searching strings for substrings and comparing strings.

http://www.cplusplus.com/reference/cstring/

The cause is the way the team that wrote Python decided to accomplish the task. Woulda-shoulda-coulda about language constraints is pointless once you are into coding.

Implementing high-level abstractions like an 'all' function from primitive operators like AND/OR is what software engineers do. Typically you would write and test your special function, compile it to a library, and link your main code to the library.

C, through the use of memory pointers, allows you direct access to memory, which translates to high speed. Compilers typically have speed-optimization options that can speed up iterative loops.
 
The speed problem with scripted languages is that they are generally interpreted, not compiled, so they are inherently slow. There is a software layer between scripted code and the actual processor machine code that incurs a speed penalty.

They are good for some things. If you need to routinely process a high number of byte-by-byte comparisons of text lists, then a more efficient language is better. If you want the absolute fastest speed, then parts of the code could be done directly in assembly language.

The same is true for math like Fast Fourier Transforms, which require a large number of floating-point multiplications, or large matrix operations.

Scroll down to the string-searching functions: searching strings for substrings and comparing strings.

http://www.cplusplus.com/reference/cstring/

The cause is the way the team that wrote Python decided to accomplish the task. Woulda-shoulda-coulda about language constraints is pointless once you are into coding.

Implementing high-level abstractions like an 'all' function from primitive operators like AND/OR is what software engineers do. Typically you would write and test your special function, compile it to a library, and link your main code to the library.

C, through the use of memory pointers, allows you direct access to memory, which translates to high speed. Compilers typically have speed-optimization options that can speed up iterative loops.

This thread is not about the benefits of interpreted vs compiled languages.
 
The speed problem with scripted languages is that they are generally interpreted, not compiled, so they are inherently slow. There is a software layer between scripted code and the actual processor machine code that incurs a speed penalty.

They are good for some things. If you need to routinely process a high number of byte-by-byte comparisons of text lists, then a more efficient language is better. If you want the absolute fastest speed, then parts of the code could be done directly in assembly language.

The same is true for math like Fast Fourier Transforms, which require a large number of floating-point multiplications, or large matrix operations.

Scroll down to the string-searching functions: searching strings for substrings and comparing strings.

http://www.cplusplus.com/reference/cstring/

The cause is the way the team that wrote Python decided to accomplish the task. Woulda-shoulda-coulda about language constraints is pointless once you are into coding.

Implementing high-level abstractions like an 'all' function from primitive operators like AND/OR is what software engineers do. Typically you would write and test your special function, compile it to a library, and link your main code to the library.

C, through the use of memory pointers, allows you direct access to memory, which translates to high speed. Compilers typically have speed-optimization options that can speed up iterative loops.

This thread is not about the benefits of interpreted vs compiled languages.

OK. Then what is your problem?

Are you just venting at the fact that Python does not have an easy solution for you or do you have a specific need you want help with? Seems like implementing your 'all' function is straightforward.
 
What's the cause for quantificational operators operating the way they do on empty lists?

Note that I'm asking for the cause, not the reason. If you wish, the implementation. I'm fully aware that any(<empty list>) should return FALSE and all(<empty list>) TRUE, not least because you want to make sure that ALL x: NOT x and NOT(ANY(x)) return the same value. But it doesn't seem to follow from how I would naively go about building those Quantifiers from unary or binary operators like NOT/AND/OR. If you think of ALL as a string of ANDs (a AND b AND c ...), it would seem more natural to end up casting an empty list to FALSE (just as a sequence of OR connectives for ANY).

Do those quantifiers explicitly treat empty lists separately to ensure consistent behaviour to specification (e. g. "IF list_is_empty(); then return TRUE; else apply_normal_logic(); FI" for ALL), or are they implemented in a way very different from what I'd naively expect?

I have no idea how they are actually implemented but I would be inclined to implement them thus:

Any:
    for each item in list:
        if match: return true
    return false

All:
    for each item in list:
        if not match: return false
    return true

Note that this will get the empty list behavior you describe.

Hmm... now I want to do some benchmarking with long lists where the first match/mismatch is towards the end. Maybe also comparing the semantically equivalent structures

not any(<list>)
vs.
all([not x for x in <list>])

Add: I did play around a bit on the python console. Anything more sophisticated will have to wait for another day (past 11pm here).
It appears that not(any(LIST)) is significantly faster.

Code:
>>> import timeit
>>> setup = """
... longlist = [0 for i in range(1000000)] # create a list of 1M 0s
... longlist.append(1) # append a 1 in position 1000000
... longlist.extend(longlist) # add that list to itself
... """  # this is the part of the code that will be executed outside of the actual benchmarking
>>> # a generator function, should be the fastest way to implement it this way around:
>>> timeit.timeit('all((not x for x in longlist))', setup=setup, number=100)
2.9960989952087402
>>> timeit.timeit('not any(longlist)', setup=setup, number=100)
0.5046029090881348
>>> # this creates a new list with 2000002 entries before it does anything else, expected to be slow:
>>> timeit.timeit('all([not x for x in longlist])', setup=setup, number=100)
5.008944034576416

Where you might find a match is unknown and the lists are unordered. There's no reason to do anything other than a linear traverse.

- - - Updated - - -

Actually, that was pretty meaningless. Did I mention it's late here?

The slow part isn't the ALL; the slow part is negating each element.

If we add the line `reversed_list = [not x for x in longlist]` to the setup, `all(reversed_list)` is actually 20% faster than `not any(longlist)` with the original version. And the ratio stays nearly constant independent of the size of the list, so it's definitely not the one-time cost of negating the booleans.

Python's `any` is slower than `all` when the first element that makes it true (or false, in the case of `all`) is deep inside a long list, but they perform the same when the decision can only be made at the last element.

See negations in my pseudocode?
 
Hmm... now I want to do some benchmarking with long lists where the first match/mismatch is towards the end. Maybe also comparing the semantically equivalent structures

not any(<list>)
vs.
all([not x for x in <list>])

Add: I did play around a bit on the python console. Anything more sophisticated will have to wait for another day (past 11pm here).
It appears that not(any(LIST)) is significantly faster.

Code:
>>> import timeit
>>> setup = """
... longlist = [0 for i in range(1000000)] # create a list of 1M 0s
... longlist.append(1) # append a 1 in position 1000000
... longlist.extend(longlist) # add that list to itself
... """  # this is the part of the code that will be executed outside of the actual benchmarking
>>> # a generator function, should be the fastest way to implement it this way around:
>>> timeit.timeit('all((not x for x in longlist))', setup=setup, number=100)
2.9960989952087402
>>> timeit.timeit('not any(longlist)', setup=setup, number=100)
0.5046029090881348
>>> # this creates a new list with 2000002 entries before it does anything else, expected to be slow:
>>> timeit.timeit('all([not x for x in longlist])', setup=setup, number=100)
5.008944034576416

Where you might find a match is unknown and the lists are unordered. There's no reason to do anything other than a linear traverse.

- - - Updated - - -

Actually, that was pretty meaningless. Did I mention it's late here?

The slow part isn't the ALL; the slow part is negating each element.

If we add the line `reversed_list = [not x for x in longlist]` to the setup, `all(reversed_list)` is actually 20% faster than `not any(longlist)` with the original version. And the ratio stays nearly constant independent of the size of the list, so it's definitely not the one-time cost of negating the booleans.

Python's `any` is slower than `all` when the first element that makes it true (or false, in the case of `all`) is deep inside a long list, but they perform the same when the decision can only be made at the last element.

See negations in my pseudocode?

That explains why ALL is faster with a deeply embedded offender, but not why they're the same again when the offender is the last element.
 
That explains why ALL is faster with a deeply embedded offender, but not why they're the same again when the offender is the last element.

What exactly do you mean? Loren's pseudo-code explains this, and as I linked to earlier, the CPython source code confirms it's that straightforward implementation. Here are some timings, first the setup:

Code:
In [1]: from timeit import timeit

In [2]: longlist = [*(0 for i in range(1000000)), 1]

In [3]: setup = 'from __main__ import longlist'

In [4]: timeit('any(longlist)', number=100)


And the timings:

Code:
In [5]: timeit('any(x for x in longlist)', number=100, setup=setup)
Out[5]: 4.055047666

In [6]: timeit('all(x for x in longlist)', number=100, setup=setup)
Out[6]: 4.790199999149536e-05

In [7]: timeit('all(not x for x in longlist)', number=100, setup=setup)
Out[7]: 4.451926699999987

In [8]: timeit('any(not x for x in longlist)', number=100, setup=setup)
Out[8]: 4.9196000190931954e-05

This is what one would expect from the following implementation, this time in Python to replicate the algorithms in the CPython source code I linked earlier. So, equivalently:

Code:
In [9]: def my_all(iterable):
   ...:     for element in iterable:
   ...:         if not element:
   ...:             return False
   ...:     return True
   ...:
   ...:

In [10]: def my_any(iterable):
    ...:     for element in iterable:
    ...:         if element:
    ...:             return True
    ...:     return False
    ...:
    ...:

Note, the above are actually straight from the relevant docs.


And timing these:

Code:
In [13]: timeit('my_any(x for x in longlist)', number=100, setup=setup)
Out[13]: 5.239888253999993

In [14]: timeit('my_all(x for x in longlist)', number=100, setup=setup)
Out[14]: 5.5455000051551906e-05

In [15]: timeit('my_all(not x for x in longlist)', number=100, setup=setup)
Out[15]: 5.584722250000027

In [16]: timeit('my_any(not x for x in longlist)', number=100, setup=setup)
Out[16]: 5.5676999863862875e-05

We see the same short-circuiting behavior.
 
That explains why ALL is faster with a deeply embedded offender, but not why they're the same again when the offender is the last element.

What exactly do you mean? Loren's pseudo-code explains this, and as I linked to earlier, the CPython source code confirms it's that straightforward implementation. Here are some timings, first the setup:

Code:
In [1]: from timeit import timeit

In [2]: longlist = [*(0 for i in range(1000000)), 1]

In [3]: setup = 'from __main__ import longlist'

In [4]: timeit('any(longlist)', number=100)


And the timings:

Code:
In [5]: timeit('any(x for x in longlist)', number=100, setup=setup)
Out[5]: 4.055047666

In [6]: timeit('all(x for x in longlist)', number=100, setup=setup)
Out[6]: 4.790199999149536e-05

In [7]: timeit('all(not x for x in longlist)', number=100, setup=setup)
Out[7]: 4.451926699999987

In [8]: timeit('any(not x for x in longlist)', number=100, setup=setup)
Out[8]: 4.9196000190931954e-05

This is what one would expect from the following implementation, this time in Python to replicate the algorithms in the CPython source code I linked earlier. So, equivalently:

Code:
In [9]: def my_all(iterable):
   ...:     for element in iterable:
   ...:         if not element:
   ...:             return False
   ...:     return True
   ...:
   ...:

In [10]: def my_any(iterable):
    ...:     for element in iterable:
    ...:         if element:
    ...:             return True
    ...:     return False
    ...:
    ...:

Note, the above are actually straight from the relevant docs.


And timing these:

Code:
In [13]: timeit('my_any(x for x in longlist)', number=100, setup=setup)
Out[13]: 5.239888253999993

In [14]: timeit('my_all(x for x in longlist)', number=100, setup=setup)
Out[14]: 5.5455000051551906e-05

In [15]: timeit('my_all(not x for x in longlist)', number=100, setup=setup)
Out[15]: 5.584722250000027

In [16]: timeit('my_any(not x for x in longlist)', number=100, setup=setup)
Out[16]: 5.5676999863862875e-05

We see the same short-circuiting behavior.

I understand that. It's just that I got some funny behavior from timeit last time, where `any` on [*(0 for i in range(1000000)), 1, *(0 for i in range(1000000))] was 20% slower than `all` on [*(1 for i in range(1000000)), 0, *(1 for i in range(1000000))], but they performed exactly the same without the second range argument.

Of course both are superfast when the offender comes early.

I can't seem to replicate it on this machine though so I have to assume I messed up my setup somehow.
 
What's the cause for quantificational operators operating the way they do on empty lists?

Note that I'm asking for the cause, not the reason. If you wish, the implementation. I'm fully aware that any(<empty list>) should return FALSE and all(<empty list>) TRUE, not least because you want to make sure that ALL x: NOT x and NOT(ANY(x)) return the same value. But it doesn't seem to follow from how I would naively go about building those Quantifiers from unary or binary operators like NOT/AND/OR. If you think of ALL as a string of ANDs (a AND b AND c ...), it would seem more natural to end up casting an empty list to FALSE (just as a sequence of OR connectives for ANY).

Do those quantifiers explicitly treat empty lists separately to ensure consistent behaviour to specification (e. g. "IF list_is_empty(); then return TRUE; else apply_normal_logic(); FI" for ALL), or are they implemented in a way very different from what I'd naively expect?
List operations arise naturally from binary operations using a "reduce" or a "fold" of the operation. For the case of the empty list, reduce should always return the identity element. So:

The sum of an empty list is the identity of addition, being 0.
The product of an empty list is the identity of multiplication, being 1.
"Or" of an empty list is the identity of disjunction, being False.
"And" of an empty list is the identity of conjunction, being True.

These are the natural definitions if you think of the primitive recursive definition of reduce. Given a binary operation, you want to say that, in the recursive case, the result of reducing is the result of applying that binary operation to the first element of the list and the result of reducing the rest of the list. You are then compelled to say that, in the base case, the case of the empty list, the identity of the operation is returned.
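In Python terms, this can be sketched with `functools.reduce`, which is one such fold: supplying the operation's identity as the initializer gives exactly the empty-list conventions listed above, because `reduce` returns the initializer untouched when the list is empty.

```python
from functools import reduce

def fold(op, identity, xs):
    # reduce falls back to the initializer when xs is empty
    return reduce(op, xs, identity)

assert fold(lambda a, b: a + b, 0, []) == 0            # empty sum
assert fold(lambda a, b: a * b, 1, []) == 1            # empty product
assert fold(lambda a, b: a or b, False, []) is False   # any([])
assert fold(lambda a, b: a and b, True, []) is True    # all([])
```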
 
To directly answer the question would require looking at the source code, which is available.

To answer for all programming languages would require an understanding of the processor's machine-code arithmetic/logical instructions, which are fixed, and of how the compiler or assembler constructs a sequence of machine-code instructions from the source code.
 
C++ has a Standard Template Library, and the C++11 version of it includes functions any_of() and all_of() that do "any" and "all".

They are implemented in the fashion that Loren Pechtel describes, with any_of() returning false for an empty list and all_of() returning true for an empty list.
 