Programming puzzle: highest proportion of 1s in a substring

Jarhyn · Dec 8, 2020

I would say your best bet is to jog your string and find the index of every 0. Then map their distances. You aren't doing an exhaustive search on every count, just looking for the high water mark on distance.

You have to read each value once, at a minimum. Why not just read each value .. exactly once?

For int pos = 0; pos < strlen; pos++{
if c == 0, {
diff = cur-pos
If diff > min && diff > hwm{
Hwm = diff
Longestpos = pos}
Pos = cur}}

Oh, I'm wrong, you can also skip work by jumping FORWARD by HWM at each 0 and then searching backwards for a breaking 0 from that forward point, possibly skipping HWM search bits

So, "11100011110001" could be searched by "111" found, skip forward at 0 by 3, see a 1, if you don't count back to the 0, restart from the first 0 before the 1 (since it cannot contain enough 1s to generate a 3 string, our HWM), and then count forward by 3 again, until we have either beaten our HWM or stepped forward again from a 0.

I'll post a general solver with short circuit look-ahead after work

Jarhyn · Dec 8, 2020

Forgive the capitalized bits .. phone is being Derp

Code:

int first = 0
While(str[first] != '0')
 First++;
Int hwm = first>=(k-1)?first:0;
Int winner = first>=(k-1)?0:-1;
Int jump = k;
Int j;

For(int I = first; I < str.len; I+= jump)
{
  If (hwm != Jump && hwm > k)
    Jump = hwm;
  If (I + jump > len)
    Break;
  //Note, empty loop. This is to calculate j!
  For( j = 0; j < jump && str[I + jump - j]!='0' j++);
  If (j == jump)
  {
    While(len < (I+ jump +  1)&&str[I+jump+1] != '0')
      Jump++;
    //Can test >= for last instance, else > for first instance)
    If (jump > HWM)
    {
      hwm = jump;
      //I = '0', I+1 is our first nonzero!
      Winner = I+1;
    }
    
  }
  Else
    Jump -= j;
}

Or something like this. I haven't tested it.

There is probably a do-while formulation from "first 1", possibly saving some lines of code, but the idea is to jump past impossible segments that cannot contain a new winner.

barbos · Dec 8, 2020

Also, it's obvious that solution must start and end with 1s. This means that you don't need to scan every length=[k,.....,2k-1]
You can find max for some n and then try to extend it with 1s to some length=m>n and if you are successful you can skip these lengths in the search and continue with m+1. Proof is the same as for [k,.....,2k-1]

Jayjay · Dec 8, 2020

barbos said:
Also, it's obvious that solution must start and end with 1s. This means that you don't need to scan every length=[k,.....,2k-1]
You can find max for some n and then try to extend it with 1s to some length=m>n and if you are successful you can skip these lengths in the search and continue with m+1. Proof is the same as for [k,.....,2k-1]

The solution doesn't necessarily have to start and end with a 1. Consider sequence 001100 and k=4.

barbos · Dec 8, 2020

Jayjay said:
barbos said:

Also, it's obvious that solution must start and end with 1s. This means that you don't need to scan every length=[k,.....,2k-1]
You can find max for some n and then try to extend it with 1s to some length=m>n and if you are successful you can skip these lengths in the search and continue with m+1. Proof is the same as for [k,.....,2k-1]

Click to expand...

The solution doesn't necessarily have to start and end with a 1. Consider sequence 001100 and k=4.

It's a degenerate case which does not change the algorithm:
1. Find solution for length=n,
2. expand it to length=m (if it's possible)
3. skip scanning lengths which are m or shorter.

Jayjay · Dec 8, 2020

barbos said:
Jayjay said:

barbos said:

Also, it's obvious that solution must start and end with 1s. This means that you don't need to scan every length=[k,.....,2k-1]
You can find max for some n and then try to extend it with 1s to some length=m>n and if you are successful you can skip these lengths in the search and continue with m+1. Proof is the same as for [k,.....,2k-1]

Click to expand...

The solution doesn't necessarily have to start and end with a 1. Consider sequence 001100 and k=4.

Click to expand...

It's a degenerate case which does not change the algorithm:
1. Find solution for length=n,
2. expand it to length=m (if it's possible)
3. skip scanning lengths which are m or shorter.

Ok, but which one of the solutions of size n do you try to expand? If it's always the last one, it seems rather arbitrary for long inputs. I suppose you could try to expand all of them as you find them and then pick the length of the longest expanded string as m.

barbos · Dec 9, 2020

Jayjay said:
barbos said:

It's a degenerate case which does not change the algorithm:
1. Find solution for length=n,
2. expand it to length=m (if it's possible)
3. skip scanning lengths which are m or shorter.

Click to expand...

Ok, but which one of the solutions of size n do you try to expand? If it's always the last one, it seems rather arbitrary for long inputs. I suppose you could try to expand all of them as you find them and then pick the length of the longest expanded string as m.

Yes, all solutions for n should be tried for expansion. The longest one would obviously have highest density and will allow skipping more scans.

barbos · Dec 9, 2020

I have even better idea for skipping.
Have ratio for the current n-scan and compare to current maximum density record and see if it is even possible to exceed current max density in the next n+1, n+2,.... scan. If it is not possible then one can safely skip it.
It has zero cost over my current code.

barbos · Dec 9, 2020

Here is a code with skipping

Code:

#!/usr/bin/python3
def scan_array(arr,k, sum):
  pos=k-1
  best=sum
  for i in range(k,len(arr)):
    sum+=(arr[i]-arr[i-k])
    if sum>best:
      pos=i
      best=sum
  return pos-k+1, best

def scan_2k(str,k):
  arr=[int(x) for x in str]
  sum=0
  for i in range(k):
    sum+=arr[i]
  k2=min(2*k,len(arr))
  best_density=0.
  ones=0
  for i in range(k,k2):
    ones=ones+1
    b=float(ones)/float(i)
    if b > best_density:
      p, ones=scan_array(arr,i,sum)
      b=float(ones)/float(i)
      if(b>best_density):
        best_density=b
        pos=p
        bl=i
    else:
      print("skipping:", i)
    sum+=arr[i]
  frm="k={} Best={} pos={}"
  print(frm.format(k,best_density, pos))
  print("substring", str[pos:(pos+bl)])

str2="0110111011101111100011"
for i in range(3,10):
  scan_2k(str2, i)

If it reaches density=1 it skips the rest automatically.

Jokodo · Dec 9, 2020

barbos said:

Here is a code with skipping

Code:

#!/usr/bin/python3
def scan_array(arr,k, sum):
  pos=k-1
  best=sum
  for i in range(k,len(arr)):
    sum+=(arr[i]-arr[i-k])
    if sum>best:
      pos=i
      best=sum
  return pos-k+1, best

def scan_2k(str,k):
  arr=[int(x) for x in str]
  sum=0
  for i in range(k):
    sum+=arr[i]
  k2=min(2*k,len(arr))
  best_density=0.
  ones=0
  for i in range(k,k2):
    ones=ones+1
    b=float(ones)/float(i)
    if b > best_density:
      p, ones=scan_array(arr,i,sum)
      b=float(ones)/float(i)
      if(b>best_density):
        best_density=b
        pos=p
        bl=i
    else:
      print("skipping:", i)
    sum+=arr[i]
  frm="k={} Best={} pos={}"
  print(frm.format(k,best_density, pos))
  print("substring", str[pos:(pos+bl)])

str="0110111011101111100011"
for i in range(3,10):
  scan_2k(str, i)

If it reaches density=1 it skips the rest automatically.

Overwriting `str`? cheeky

barbos · Dec 9, 2020

Jarhyn said:
Forgive the capitalized bits .. phone is being Derp

Code:

int first = 0 While(str[first] != '0') First++; Int hwm = first>=(k-1)?first:0; Int winner = first>=(k-1)?0:-1; Int jump = k; Int j; For(int I = first; I < str.len; I+= jump) { If (hwm != Jump && hwm > k) Jump = hwm; If (I + jump > len) Break; //Note, empty loop. This is to calculate j! For( j = 0; j < jump && str[I + jump - j]!='0' j++); If (j == jump) { While(len < (I+ jump + 1)&&str[I+jump+1] != '0') Jump++; //Can test >= for last instance, else > for first instance) If (jump > HWM) { hwm = jump; //I = '0', I+1 is our first nonzero! Winner = I+1; } } Else Jump -= j; }

Or something like this. I haven't tested it.

There is probably a do-while formulation from "first 1", possibly saving some lines of code, but the idea is to jump past impossible segments that cannot contain a new winner.

Guess why I have 0% confidence in your "solution". Even without actually trying to understand it.

barbos · Dec 9, 2020

Jokodo said:
Overwriting `str`? cheeky

Why not?

Jokodo · Dec 9, 2020

barbos said:
Jokodo said:

Overwriting `str`? cheeky

Click to expand...

Why not?

Because next time you want to cast an object to a string via `str(obj)`, it'll complain with `TypeError: 'str' object is not callable`

barbos · Dec 9, 2020

Jokodo said:
barbos said:

Jokodo said:

Overwriting `str`? cheeky

Click to expand...

Why not?

Click to expand...

Because next time you want to cast an object to a string via `str(obj)`, it'll complain with `TypeError: 'str' object is not callable`

I am not going to do that, but if I was going to do that I would change the name after getting such complain.

Jarhyn · Dec 9, 2020

barbos said:
Jarhyn said:

Forgive the capitalized bits .. phone is being Derp

Code:

int first = 0 While(str[first] != '0') First++; Int hwm = first>=(k-1)?first:0; Int winner = first>=(k-1)?0:-1; Int jump = k; Int j; For(int I = first; I < str.len; I+= jump) { If (hwm != Jump && hwm > k) Jump = hwm; If (I + jump > len) Break; //Note, empty loop. This is to calculate j! For( j = 0; j < jump && str[I + jump - j]!='0' j++); If (j == jump) { While(len < (I+ jump + 1)&&str[I+jump+1] != '0') Jump++; //Can test >= for last instance, else > for first instance) If (jump > HWM) { hwm = jump; //I = '0', I+1 is our first nonzero! Winner = I+1; } } Else Jump -= j; }

Or something like this. I haven't tested it.

There is probably a do-while formulation from "first 1", possibly saving some lines of code, but the idea is to jump past impossible segments that cannot contain a new winner.

Click to expand...

Guess why I have 0% confidence in your "solution". Even without actually trying to understand it.

Because you commonly disregard the ideas of others without trying to understand it, In my experience. You didn't have to be an ass.

At any rate, ~~the solution works~~ actually works, now.

returns 8, with a little tweaking for typos that I expected to exist.

Code:

//find longest substring of "1" in a string containing only 1 and 0.
int fn(char* str, int len, int k, char** substr)
{
    int first = 0;
    int found = 0;
    int hwm = 0;
    int winner = -1;
    int jump = k;
    int j;

    //start the process at the head of the string. Why not?
    for(int i = -1; i < len; i+= jump)
    {
        if (i + jump > len)
            break;
           
        jump = hwm > k ? hwm : k;

        //set jump if necessary to the larger size. For next time
        for (j = 0; j < jump && str[i + (jump - j)] != '0'; j++);
       
        if (j + found == jump)
            j += found;

        if (j == jump)
        {
            found = 0;
       
            while (len > (i + jump + 1) && str[i + jump +1]!= '0')
                jump++;
       
            if (jump > hwm)
            {
                hwm = jump;
                *substr = &(str[i+1]);
            }
        }
        else{
            jump -= j;
            found = j;
        }
    }
    return hwm;
}

int main(int argc, char *argv[])
{
    //               123456789012345678901234567890123456
    char string[] = "010100111101010100000111111110101110";
    char* result;
    int hwm = fn(string, 36, 4, &result);
    printf("%d, %s", hwm, result);
   
}

I'll take a look at how it's missing the count: I missed a +1

Jayjay · Dec 9, 2020

Jarhyn said:

barbos said:
Jarhyn said:

Forgive the capitalized bits .. phone is being Derp

Code:

int first = 0 While(str[first] != '0') First++; Int hwm = first>=(k-1)?first:0; Int winner = first>=(k-1)?0:-1; Int jump = k; Int j; For(int I = first; I < str.len; I+= jump) { If (hwm != Jump && hwm > k) Jump = hwm; If (I + jump > len) Break; //Note, empty loop. This is to calculate j! For( j = 0; j < jump && str[I + jump - j]!='0' j++); If (j == jump) { While(len < (I+ jump + 1)&&str[I+jump+1] != '0') Jump++; //Can test >= for last instance, else > for first instance) If (jump > HWM) { hwm = jump; //I = '0', I+1 is our first nonzero! Winner = I+1; } } Else Jump -= j; }

Or something like this. I haven't tested it.

There is probably a do-while formulation from "first 1", possibly saving some lines of code, but the idea is to jump past impossible segments that cannot contain a new winner.

Click to expand...

Guess why I have 0% confidence in your "solution". Even without actually trying to understand it.

Because you commonly disregard the ideas of others without trying to understand it, In my experience. You didn't have to be an ass.

At any rate, ~~the solution works~~ actually works, now.

returns 8, with a little tweaking for typos that I expected to exist.

Code:

//find longest substring of "1" in a string containing only 1 and 0.
int fn(char* str, int len, int k, char** substr)
{
    int first = 0;
    int found = 0;
    int hwm = 0;
    int winner = -1;
    int jump = k;
    int j;

    //start the process at the head of the string. Why not?
    for(int i = -1; i < len; i+= jump)
    {
        if (i + jump > len)
            break;
           
        jump = hwm > k ? hwm : k;

        //set jump if necessary to the larger size. For next time
        for (j = 0; j < jump && str[i + (jump - j)] != '0'; j++);
       
        if (j + found == jump)
            j += found;

        if (j == jump)
        {
            found = 0;
       
            while (len > (i + jump + 1) && str[i + jump +1]!= '0')
                jump++;
       
            if (jump > hwm)
            {
                hwm = jump;
                *substr = &(str[i+1]);
            }
        }
        else{
            jump -= j;
            found = j;
        }
    }
    return hwm;
}

int main(int argc, char *argv[])
{
    //               123456789012345678901234567890123456
    char string[] = "010100111101010100000111111110101110";
    char* result;
    int hwm = fn(string, 36, 4, &result);
    printf("%d, %s", hwm, result);
   
}

I'll take a look at how it's missing the count: I missed a +1

Your solution may work, but it seems to solve a different problem. Also, result is not null terminated so you can't print it out with %s.

Jarhyn · Dec 9, 2020

Jayjay said:

Jarhyn said:

Because you commonly disregard the ideas of others without trying to understand it, In my experience. You didn't have to be an ass.

At any rate, ~~the solution works~~ actually works, now.

returns 8, with a little tweaking for typos that I expected to exist.

Code:

//find longest substring of "1" in a string containing only 1 and 0.
int fn(char* str, int len, int k, char** substr)
{
    int first = 0;
    int found = 0;
    int hwm = 0;
    int winner = -1;
    int jump = k;
    int j;

    //start the process at the head of the string. Why not?
    for(int i = -1; i < len; i+= jump)
    {
        if (i + jump > len)
            break;
           
        jump = hwm > k ? hwm : k;

        //set jump if necessary to the larger size. For next time
        for (j = 0; j < jump && str[i + (jump - j)] != '0'; j++);
       
        if (j + found == jump)
            j += found;

        if (j == jump)
        {
            found = 0;
       
            while (len > (i + jump + 1) && str[i + jump +1]!= '0')
                jump++;
       
            if (jump > hwm)
            {
                hwm = jump;
                *substr = &(str[i+1]);
            }
        }
        else{
            jump -= j;
            found = j;
        }
    }
    return hwm;
}

int main(int argc, char *argv[])
{
    //               123456789012345678901234567890123456
    char string[] = "010100111101010100000111111110101110";
    char* result;
    int hwm = fn(string, 36, 4, &result);
    printf("%d, %s", hwm, result);
   
}

I'll take a look at how it's missing the count: I missed a +1

Your solution may work, but it seems to solve a different problem.

Ah, I see now. I read the problem wrong. Most dense substring of any length, not the longest substring of solid, minimal length k.

Jarhyn · Dec 9, 2020

Jayjay said:

Jarhyn said:

Because you commonly disregard the ideas of others without trying to understand it, In my experience. You didn't have to be an ass.

At any rate, ~~the solution works~~ actually works, now.

returns 8, with a little tweaking for typos that I expected to exist.

Code:

//find longest substring of "1" in a string containing only 1 and 0.
int fn(char* str, int len, int k, char** substr)
{
    int first = 0;
    int found = 0;
    int hwm = 0;
    int winner = -1;
    int jump = k;
    int j;

    //start the process at the head of the string. Why not?
    for(int i = -1; i < len; i+= jump)
    {
        if (i + jump > len)
            break;
           
        jump = hwm > k ? hwm : k;

        //set jump if necessary to the larger size. For next time
        for (j = 0; j < jump && str[i + (jump - j)] != '0'; j++);
       
        if (j + found == jump)
            j += found;

        if (j == jump)
        {
            found = 0;
       
            while (len > (i + jump + 1) && str[i + jump +1]!= '0')
                jump++;
       
            if (jump > hwm)
            {
                hwm = jump;
                *substr = &(str[i+1]);
            }
        }
        else{
            jump -= j;
            found = j;
        }
    }
    return hwm;
}

int main(int argc, char *argv[])
{
    //               123456789012345678901234567890123456
    char string[] = "010100111101010100000111111110101110";
    char* result;
    int hwm = fn(string, 36, 4, &result);
    printf("%d, %s", hwm, result);
   
}

I'll take a look at how it's missing the count: I missed a +1

Your solution may work, but it seems to solve a different problem. Also, result is not null terminated so you can't print it out with %s.

I wasn't looking to snprintf or any of that, I just wanted to see the remainder of the string. I could have done the strtok thing and nulled the character, I guess? I just wanted the index properly calculated.

Interesting to me is the fact that for the case where there are K or more consecutive 1's my solution does solve the problem. I am almost certain both N! And potentially even a linear solution is possible, though. The N! Solution would be (find the maximally dense string of at least K size for each point going forward from index i++), with some early termination possible.

There's a fair bit of work necessary to prove out a linear solution though for a non-consecutive best K+

I'll also note that using (ad>bc) can free the function entirely of division and floating point math in general.

Jayjay · Dec 9, 2020

Jarhyn said:
Jayjay said:

Jarhyn said:

Because you commonly disregard the ideas of others without trying to understand it, In my experience. You didn't have to be an ass.

At any rate, ~~the solution works~~ actually works, now.

returns 8, with a little tweaking for typos that I expected to exist.

Code:

//find longest substring of "1" in a string containing only 1 and 0. int fn(char* str, int len, int k, char** substr) { int first = 0; int found = 0; int hwm = 0; int winner = -1; int jump = k; int j; //start the process at the head of the string. Why not? for(int i = -1; i < len; i+= jump) { if (i + jump > len) break; jump = hwm > k ? hwm : k; //set jump if necessary to the larger size. For next time for (j = 0; j < jump && str[i + (jump - j)] != '0'; j++); if (j + found == jump) j += found; if (j == jump) { found = 0; while (len > (i + jump + 1) && str[i + jump +1]!= '0') jump++; if (jump > hwm) { hwm = jump; *substr = &(str[i+1]); } } else{ jump -= j; found = j; } } return hwm; } int main(int argc, char *argv[]) { // 123456789012345678901234567890123456 char string[] = "010100111101010100000111111110101110"; char* result; int hwm = fn(string, 36, 4, &result); printf("%d, %s", hwm, result); }

I'll take a look at how it's missing the count: I missed a +1

Click to expand...

Your solution may work, but it seems to solve a different problem. Also, result is not null terminated so you can't print it out with %s.

Click to expand...

I wasn't looking to snprintf or any of that, I just wanted to see the remainder of the string. I could have done the strtok thing and nulled the character, I guess? I just wanted the index properly calculated.

Code:

printf("%d, %.*s", hwm, hwm, result);

Jarhyn · Dec 9, 2020

Jayjay said:
Jarhyn said:

I wasn't looking to snprintf or any of that, I just wanted to see the remainder of the string. I could have done the strtok thing and nulled the character, I guess? I just wanted the index properly calculated.

Click to expand...

Code:

printf("%d, %.*s", hwm, hwm, result);

Interesting. I've been writing c++ for 20 years, and this is my first time seeing a length operator for printf. Learn something new every day I guess. For my own defense, I've never needed it, and it's hard to read. That said, I won't be able to start milling out a roll at the non-consecutive case until tomorrow night or maybe lunch tomorrow.

Programming puzzle: highest proportion of 1s in a substring

Wizard

Wizard

Contributor

Contributor

Contributor

Contributor

Contributor

Contributor

Contributor

Veteran Member

Contributor

Contributor

Veteran Member

Contributor

Wizard

Contributor

Wizard

Wizard

Contributor

Wizard