Warm tip: This article is reproduced from serverfault.com, please click

Understand "find maximum occuring char" in a string algorithm

发布于 2020-12-07 10:59:52

I'm learning Data Structures and Algorithms and I've a small problem of understanding the process of find which char occurs the most in a string.

I understand the general goal - having an array that represent the count of a specific char, I obviously understand how to find max in array but I've big problem with this chuck of code(code from https://www.geeksforgeeks.org/return-maximum-occurring-character-in-the-input-string/):

        int count[] = new int[256]; 

          for (int i=0; i<str.length(); i++) 
            count[str.charAt(i)]++; <-- what I don't understand

I'm initializing count array to hold ints, but inside the for loop I'm searhing for specific char in a string, so for example:

count["t"]++

So it basically telling me "give me the value of index "t"? how can I even search with chararter where I should search with index?

Also in kotlin I'm getting expection (count[str.get(i)]) that it's expecting int, not char.

I probably missed fundamental concept that prevent me from understand this, but after short google search I didn't find much.

Questioner
Masksway
Viewed
0
dreamcrash 2021-02-21 04:49:26

Java will convert the char into an int, for instance, 'A' to 65 according to the ASCII table.

enter image description here

As long as your string does not contain a character that would return a value bigger than 255 (e.g., "€"), an array of 256 positions will be enough to map the possible chars into array positions. For instance, for the English alphabet, it would be enough. Nevertheless, since the char in Java is 2 bytes (16 bits), an array of size 65536 (2^16) is enough to be on the safe side. You can also calculate the max int value from all the chars present on that string (assuming a non-null or empty string) and allocate the array accordingly:

int count[] = new int[str.chars().max().orElse(0)+1];

Returning to your question:

count[some_char]++

Converts some_char into an int, and increments by one value on the corresponding array count position.

You can think about this process as a simple hash function that maps 'char' to 'int', and even though it is simple, it perfectly suits the problem at hand since it uniquely maps a given char to a position on that array.

I'm initializing count array to hold ints, but inside the for loop I'm searhing for specific char in a string, so for example:

count["t"]++ So it basically telling me "give me the value of index "t"? how can I even search with chararter where I should search with index?

Be aware that count["t"]++ would give you a compile error, the function str.charAt(i) returns you a char, not a String, hence 't' and not "t".

A running example:

import java.util.Arrays;
import java.util.stream.Collectors;

public class FindMaximumOccurringChar {

    private static int[] countChar(String str) {
        int[] count = new int[str.chars().max().orElse(0) + 1];
        for (int i = 0; i< str.length(); i++)
            count[str.charAt(i)]++;
        return count;
    }

    public static void main(String[] args) {
        String str = "miaumiauuuuu";
        int[] count = countChar(str);
        String str_without_duplicated_char = Arrays.stream(str.split(""))
                .distinct()
                .collect(Collectors.joining());

        for (int i=0; i<str_without_duplicated_char.length(); i++){
            System.out.println("The char '"+str_without_duplicated_char.charAt(i)+"' shows up "
                    + count[str_without_duplicated_char.charAt(i)] +" times");
        }
    }
}

The output:

The char 'm' shows up 2 times
The char 'i' shows up 2 times
The char 'a' shows up 2 times
The char 'u' shows up 6 times