String Sorting in Java

String Sorting in Java

Sorting plays a crucial role in organizing and analyzing data, and Java provides a robust set of tools for sorting various data structures, including strings. String sorting involves arranging the words within a string in a specific order, typically alphabetical.

In this article, we explore different sorting techniques applied to strings in Java. We will look into the implementation of the direct insertion sorting algorithm and more advanced techniques such as the Merge Sort and Quick Sort algorithms. Each approach offers unique advantages by providing Java developers with flexibility in choosing the most suitable method based on the specific requirements of their applications.

String Sorting using the Direct Insertion Algorithm

This Java code sorts a given string containing words in alphabetical order using the direct insertion sorting algorithm. The program first counts the number of words in the input string, then creates an array of strings to store each word separately. Finally, it sorts the array using the direct insertion sorting algorithm and prints the sorted words. The function uses a space character as a separator between words.

import java.util.HashSet;
import java.util.Set;

public class SortWithDirectInsertion {
    public static void main(String[] args) {
        // Declare the String that is to be sorted.
        String text = "MYCPLUS C and C++ Programming Resources was developed after facing difficulties in learning computer programming languages back in university in 2001. The first ever computer programming I learnt was C programming language. The main focus of website is also on C and C++ programming though articles on other programming languages and technologies are also available now.";

        // Determine how many unique words there are
        Set<String> uniqueWords = new HashSet<>();
        int index = 0;
        char separator = ' '; // inter-word separating character

        while ((index = text.indexOf(' ', index)) != -1) {
            index++;
            // Increment the count for each space found, indicating a word boundary
        }

        index = 0;
        int endIndex;
        int count = 0;

        for (int i = 0; i < text.length(); i++) {
            char currentChar = text.charAt(i);
            if (currentChar == separator || i == text.length() - 1) {
                // Found a word boundary or reached the end of the string
                endIndex = (i == text.length() - 1) ? i + 1 : i; // set endIndex to i + 1 if it's the last character
                String word = text.substring(index, endIndex).replaceAll("[^a-zA-Z0-9]+", ""); // remove special characters
                if (!word.isEmpty()) {
                    uniqueWords.add(word.toLowerCase()); // store cleaned up word in lower case
                    count++;
                }
                index = i + 1; // set the next starting index
            }
        }

        // Convert the Set of unique words to an array
        String[] words = uniqueWords.toArray(new String[0]);

        // Sort the substring array using direct insertion
        for (int j = 1; j < words.length; j++) {
            String a = words[j]; // Put the current word in a buffer
            int i = j - 1; // Start compare with the previous word

            while (i >= 0 && words[i].compareToIgnoreCase(a) > 0) {
                words[i + 1] = words[i]; // is greater than i+1th, swap them
                i--;
            }
            words[i + 1] = a; // Put the stored word in the vacant place.
        }

        // Display the sorted array of words
        for (String word : words)
            System.out.println(word);
    }
}

String Sorting using the Merge Sort Algorithm

Merge Sort is a divide-and-conquer algorithm that recursively divides the array into halves, sorts each half, and then merges them back together. Here’s an updated version of the Java program using merge sort.

import java.util.HashSet;
import java.util.Set;

public class SortWithMergeSort {
    public static void main(String[] args) {
        String text = "MYCPLUS C and C++ Programming Resources was developed after facing difficulties in learning computer programming languages back in university in 2001. The first ever computer programming I learnt was C programming language. The main focus of website is also on C and C++ programming though articles on other programming languages and technologies are also available now.";

        Set<String> uniqueWords = new HashSet<>();
        int index = 0;
        char separator = ' ';

        while ((index = text.indexOf(separator, index)) != -1) {
            index++;
        }

        index = 0;
        int endIndex;

        for (int i = 0; i < text.length(); i++) {
            char currentChar = text.charAt(i);
            if (currentChar == separator || i == text.length() - 1) {
                endIndex = (i == text.length() - 1) ? i + 1 : i;
                String word = text.substring(index, endIndex).replaceAll("[^a-zA-Z0-9]+", "");
                if (!word.isEmpty()) {
                    uniqueWords.add(word.toLowerCase());
                }
                index = i + 1;
            }
        }

        String[] words = uniqueWords.toArray(new String[0]);
        mergeSort(words, 0, words.length - 1);

        for (String word : words) {
            System.out.println(word);
        }
    }

    private static void mergeSort(String[] arr, int left, int right) {
        if (left < right) {
            int mid = (left + right) / 2;

            mergeSort(arr, left, mid);
            mergeSort(arr, mid + 1, right);

            merge(arr, left, mid, right);
        }
    }

    private static void merge(String[] arr, int left, int mid, int right) {
        int n1 = mid - left + 1;
        int n2 = right - mid;

        String[] leftArr = new String[n1];
        String[] rightArr = new String[n2];

        System.arraycopy(arr, left, leftArr, 0, n1);
        System.arraycopy(arr, mid + 1, rightArr, 0, n2);

        int i = 0, j = 0, k = left;

        while (i < n1 && j < n2) {
            if (leftArr[i].compareToIgnoreCase(rightArr[j]) <= 0) {
                arr[k] = leftArr[i];
                i++;
            } else {
                arr[k] = rightArr[j];
                j++;
            }
            k++;
        }

        while (i < n1) {
            arr[k] = leftArr[i];
            i++;
            k++;
        }

        while (j < n2) {
            arr[k] = rightArr[j];
            j++;
            k++;
        }
    }
}

String Sorting using the Quick Sort Algorithm

Quick Sort is a divide-and-conquer algorithm that selects a ‘pivot‘ element and partitions the other elements into two sub-arrays according to whether they are less than or greater than the pivot. Here’s the updated version of your program using Quick Sort:

import java.util.HashSet;
import java.util.Set;

public class SortWithQuickSort {
    public static void main(String[] args) {
     String text = "MYCPLUS C and C++ Programming Resources was developed after facing difficulties in learning computer programming languages back in university in 2001. The first ever computer programming I learnt was C programming language. The main focus of website is also on C and C++ programming though articles on other programming languages and technologies are also available now.";

        Set<String> uniqueWords = new HashSet<>();
        int index = 0;
        char separator = ' ';

        while ((index = text.indexOf(separator, index)) != -1) {
            index++;
        }

        index = 0;
        int endIndex;

        for (int i = 0; i < text.length(); i++) {
            char currentChar = text.charAt(i);
            if (currentChar == separator || i == text.length() - 1) {
                endIndex = (i == text.length() - 1) ? i + 1 : i;
                String word = text.substring(index, endIndex).replaceAll("[^a-zA-Z0-9]+", "");
                if (!word.isEmpty()) {
                    uniqueWords.add(word.toLowerCase());
                }
                index = i + 1;
            }
        }

        String[] words = uniqueWords.toArray(new String[0]);
        quickSort(words, 0, words.length - 1);

        for (String word : words) {
            System.out.println(word);
        }
    }

    private static void quickSort(String[] arr, int low, int high) {
        if (low < high) {
            int pi = partition(arr, low, high);

            quickSort(arr, low, pi - 1);
            quickSort(arr, pi + 1, high);
        }
    }

    private static int partition(String[] arr, int low, int high) {
        String pivot = arr[high];
        int i = low - 1;

        for (int j = low; j < high; j++) {
            if (arr[j].compareToIgnoreCase(pivot) <= 0) {
                i++;

                String temp = arr[i];
                arr[i] = arr[j];
                arr[j] = temp;
            }
        }

        String temp = arr[i + 1];
        arr[i + 1] = arr[high];
        arr[high] = temp;

        return i + 1;
    }
}

The output of the Java program is as below.

2001
after
also
and
are
articles
available
back
c
computer
developed
difficulties
ever
facing
first
focus
i
in
is
language
languages
learning
learnt
main
mycplus
now
of
on
other
programming
resources
technologies
the
though
university
was
website

See also: Shell Sort, Quick Sort, Bubble Sort, Insertion Sort, Selection Sort

M. Saqib: Saqib is Master-level Senior Software Engineer with over 14 years of experience in designing and developing large-scale software and web applications. He has more than eight years experience of leading software development teams. Saqib provides consultancy to develop software systems and web services for Fortune 500 companies. He has hands-on experience in C/C++ Java, JavaScript, PHP and .NET Technologies. Saqib owns and write contents on mycplus.com since 2004.
Related Post