Search found 380 matches

by Provoni
Jul 15, 2020 17:14
Forum: Tips and Tricks
Topic: Fast text search (hash table / GLib)
Replies: 19
Views: 1779

Re: Fast text search (hash table / GLib)

Makoto WATANABE wrote:
Dear Provoni;
I want to implement an associative array.

Why do you need that? What are you planning to do?
by Provoni
Jul 15, 2020 17:11
Forum: Community Discussion
Topic: Rule 30
Replies: 3
Views: 408

Re: Rule 30

@integer, that is a different rule number. Update: array accesses should now stay in bounds, and I added support for colors plus some optimization:
'Stephen Wolfram rule 30 with $30,000 in prizes: https://writings.stephenwolfram.com/2019/10/announcing-the-rule-30-prizes/
'- Problem 1: Does the center column always remain no...
by Provoni
Jul 11, 2020 9:16
Forum: Community Discussion
Topic: Rule 30
Replies: 3
Views: 408

Rule 30

This program generates Stephen Wolfram's rule 30. It seems that the general conjecture is that the center column may act as a stream of random bits. Stephen offers $30,000 in prizes for people who can answer problems related to rule 30: https://www.rule30prize.org/
'Stephen Wolfram rule 30 with $30,...
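
The listing in this preview is cut off, so as a rough stand-in here is a minimal Rule 30 sketch in FreeBASIC. It is not Provoni's program: the names, the row width, the generation count, and the zero boundary are all assumptions chosen for illustration. Each new cell is computed as left XOR (center OR right), which is the Rule 30 update.

```freebasic
' Minimal Rule 30 sketch (illustrative; not the program from the post).
' Assumptions: fixed-width row, cells beyond the edges count as 0.
const ROW_WIDTH = 63        ' odd, so there is a single center cell
const GENERATIONS = 31      ' few enough that the pattern stays inside the row

dim as ubyte cur(0 to ROW_WIDTH - 1), nxt(0 to ROW_WIDTH - 1)
dim as integer x, g, lft, ctr, rgt
dim as string row

cur(ROW_WIDTH \ 2) = 1      ' start from one live cell in the middle

for g = 1 to GENERATIONS
    ' print the current generation as a line of # and .
    row = ""
    for x = 0 to ROW_WIDTH - 1
        if cur(x) <> 0 then
            row &= "#"
        else
            row &= "."
        end if
    next x
    print row

    ' Rule 30: new cell = left XOR (center OR right)
    for x = 0 to ROW_WIDTH - 1
        lft = 0
        rgt = 0
        if x > 0 then lft = cur(x - 1)              ' guard keeps the index in bounds
        if x < ROW_WIDTH - 1 then rgt = cur(x + 1)
        ctr = cur(x)
        nxt(x) = lft xor (ctr or rgt)
    next x

    ' the new row becomes the current row
    for x = 0 to ROW_WIDTH - 1
        cur(x) = nxt(x)
    next x
next g
```

The center column discussed in the prize problems is cur(ROW_WIDTH \ 2) at each generation.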
by Provoni
Jul 05, 2020 7:40
Forum: Projects
Topic: FbEdit
Replies: 963
Views: 165115

Re: FbEdit

In the Functions panel, when one left-clicks on the tab itself, a menu will pop up with "Highlight -> Update" which will color shared variables, subs and types. I find this extremely useful but cannot find the option to enable this by default and have to constantly re-enable it. Anyone kno...
by Provoni
Jul 04, 2020 6:15
Forum: Community Discussion
Topic: Getting old
Replies: 39
Views: 1485

Re: Getting old

I am trying to think about what random numbers are. What is the opposite of random numbers? I would like to say sequential numbers ("1234567...") but I am not sure. And how is randomness measured? It seems so simple to measure sequential numbers (+1) but up to infinitely hard to measure random ...
by Provoni
Jul 03, 2020 14:33
Forum: Tips and Tricks
Topic: Fast text search (hash table / GLib)
Replies: 19
Views: 1779

Re: Fast text search (hash table / GLib)

Makoto, what is it that you actually want to do?
by Provoni
Jul 03, 2020 14:28
Forum: Community Discussion
Topic: Getting old
Replies: 39
Views: 1485

Re: Getting old

I am going to leave my body to medical science but I can imagine a conversation like: "His brain is in good shape but as for the rest of him I think the furnace beckons, don't you?". His colleague may remark: "I am inclined to agree - let's have his brain out". Perhaps it is bet...
by Provoni
Jun 20, 2020 10:56
Forum: General
Topic: Duplicates
Replies: 34
Views: 1252

Re: Duplicates

Don't think my idea will work.
by Provoni
Jun 20, 2020 10:19
Forum: General
Topic: Duplicates
Replies: 34
Views: 1252

Re: Duplicates

8 buckets: For every new bucket I add a unique character to the string that is otherwise unused throughout the corpus. Okay or not?

EDIT: Not okay, there was an error in my application of the idea.

screenres 640,480,32
dim as uinteger i,j,b,l
dim as string s
dim as double t=timer
redim shared as ubyte ...
by Provoni
Jun 20, 2020 9:56
Forum: General
Topic: Duplicates
Replies: 34
Views: 1252

Re: Duplicates

I just realized that multiple CRC32 buckets can be used like so:

if crc32array(0,j)=0 then 'unique
    crc32array(0,j)=1
    print #2,s
else 'collision, go to 2nd bucket
    print #3,s
    j=crc32(s+"*") 'unused character in corpus
    if crc32array(1,j)=0 then 'unique
        crc32array(1,j)=1
        print #2,s
    else 'colli...
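
A self-contained sketch of the two-level check described in this post may help show how it behaves. Several things here are assumptions, not taken from the post: FNV-1a stands in for crc32() to keep the listing short, the tables are indexed by only the low 20 hash bits so the demo stays small, the names fnv1a, looksNew and seen are hypothetical, and the final branch (cut off in the preview) is assumed to report a duplicate.

```freebasic
' Sketch of a two-level hash filter. Assumptions, not from the post:
' FNV-1a replaces crc32(), tables use the low 20 hash bits, and the
' cut-off final branch is taken to mean "duplicate".
const TABLE_BITS = 20
const TABLE_SIZE = 1 shl TABLE_BITS

dim shared as ubyte seen(0 to 1, 0 to TABLE_SIZE - 1)

function fnv1a(byref s as const string) as ulong
    dim as ulong h = 2166136261ul          ' FNV-1a 32-bit offset basis
    for i as integer = 0 to len(s) - 1
        h = (h xor s[i]) * 16777619ul      ' FNV prime; stored value wraps to 32 bits
    next i
    return h
end function

' Returns true when the filter treats s as not seen before.
function looksNew(byref s as const string) as boolean
    dim as ulong j = fnv1a(s) and (TABLE_SIZE - 1)
    if seen(0, j) = 0 then                 ' level 0 slot free: treat as unique
        seen(0, j) = 1
        return true
    end if
    ' level 0 slot taken: rehash with a salt character and try level 1
    j = fnv1a(s + "*") and (TABLE_SIZE - 1)
    if seen(1, j) = 0 then
        seen(1, j) = 1
        return true
    end if
    return false                           ' both slots taken
end function

print looksNew("hello world, line one")
print looksNew("hello world, line one")
print looksNew("hello world, line one")
```

Feeding the same line three times reports it as new on the first two calls and as a duplicate only on the third. A repeated line can therefore pass once per level, while two different lines that collide at every level would be reported as duplicates. An exact comparison of the line text is still needed for a reliable result.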
by Provoni
Jun 20, 2020 7:53
Forum: General
Topic: Duplicates
Replies: 34
Views: 1252

Re: Duplicates

Can anyone make a CRC-n? For example CRC33 or CRC35.
by Provoni
Jun 20, 2020 7:29
Forum: General
Topic: Duplicates
Replies: 34
Views: 1252

Re: Duplicates

Can't keep up with all the information, but thanks a lot!
"p.p.s. when replying please specify if this is for a one-off case, or that you have to do this e.g. every month/year etc for new measurement data or so."
Maybe about 1-3 times a year. I am creating letter n-gram frequencies for my solver http:...
by Provoni
Jun 18, 2020 19:01
Forum: General
Topic: Duplicates
Replies: 34
Views: 1252

Re: Duplicates

Thanks everyone for all the replies. Still need to catch up.
"Isn't a checksum just a 'low quality' hash?"
No clue, it could be.
"So we’re talking roughly a billion lines of text? One question that occurs to me is, how many unique lines are there likely to be?"
I never counted the lines, will follow up ...
by Provoni
Jun 16, 2020 17:48
Forum: General
Topic: Duplicates
Replies: 34
Views: 1252

Re: Duplicates

"Looks like a case for hashing"
Thanks jj2007. I was thinking of using a checksum for each line of text.
Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string. Hashing is used to index and retrieve items in a datab...
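
As an illustration of a per-line checksum, here is a minimal table-driven CRC-32 sketch in FreeBASIC, using the reflected polynomial &hEDB88320 that zlib and PNG also use. The names crcTable, initCrcTable and crc32 are chosen for this example and are not taken from the thread.

```freebasic
' Sketch: table-driven CRC-32 used as a fixed-length fingerprint of a line.
dim shared as ulong crcTable(0 to 255)

' Precompute the CRC of every possible byte value.
sub initCrcTable()
    dim as ulong c
    for n as integer = 0 to 255
        c = n
        for k as integer = 1 to 8
            if (c and 1) <> 0 then
                c = (c shr 1) xor &hEDB88320ul
            else
                c = c shr 1
            end if
        next k
        crcTable(n) = c
    next n
end sub

' Returns the 32-bit checksum of s, whatever its length.
function crc32(byref s as const string) as ulong
    dim as ulong c = &hFFFFFFFFul
    for i as integer = 0 to len(s) - 1
        c = crcTable((c xor s[i]) and &hFF) xor (c shr 8)
    next i
    return c xor &hFFFFFFFFul
end function

initCrcTable()
print hex(crc32("123456789"), 8)           ' standard CRC-32 check input
print hex(crc32("some line of text"), 8)   ' any line maps to 8 hex digits
```

The standard check value for the input "123456789" is CBF43926, which gives a quick way to verify the table. Equal checksums do not prove that two lines are equal, since different lines can share the same 32-bit value.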
by Provoni
Jun 15, 2020 16:46
Forum: General
Topic: Duplicates
Replies: 34
Views: 1252

Duplicates

Hey all,

I have a very large text file (nearly 1 TB). Each line holds some text with a minimum length of twenty bytes and a maximum length of perhaps thousands of bytes. I want to remove the duplicate entries from this file. How would one approach this in FreeBASIC?

Thanks
