Count words in a given string
February 13, 2013
Given a string, count the number of words in it. The words are separated by the following characters: space (' '), newline ('\n'), tab ('\t'), or a combination of these.
There can be many solutions to this problem. The following is a simple and interesting one.
The idea is to maintain two states: IN and OUT. The state OUT indicates that the last character seen was a separator. The state IN indicates that the last character seen was a word character. We increment the word count when the previous state is OUT and the next character is a word character.
/* Program to count the number of words in a given input string. */
#include <stdio.h>

#define OUT 0
#define IN 1

// Returns the number of words in str
unsigned countWords(char *str)
{
    int state = OUT;
    unsigned wc = 0; // word count

    // Scan all characters one by one
    while (*str)
    {
        // If the next character is a separator, set the state to OUT
        if (*str == ' ' || *str == '\n' || *str == '\t')
            state = OUT;

        // If the next character is not a word separator and the state is OUT,
        // then set the state to IN and increment the word count
        else if (state == OUT)
        {
            state = IN;
            ++wc;
        }

        // Move to the next character
        ++str;
    }

    return wc;
}

// Driver program to test the above function
int main(void)
{
    char str[] = "One two three\n four\nfive ";
    printf("No of words: %u\n", countWords(str));
    return 0;
}
Output:
No of words: 5
Time complexity: O(n), where n is the length of the string.
This article is compiled by Narendra Kangralkar. Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.
This code does not work when there is just one word without any spaces or newlines.
E.g., for
"testing"
it gives 0 as the answer.
Could you post source code that handles all the cases?
It works. See http://ideone.com/J5AbB2
Python code with O(n) time complexity:
def countingWordsInString(s):
    wordsCount = 0
    list1 = list(s)
    for i in range(len(list1)):
        if list1[i] != " " and list1[i] != "\n" and list1[i] != "\t":
            continue
        elif i != 0 and list1[i-1] != " " and list1[i-1] != "\n" and list1[i-1] != "\t":
            wordsCount += 1
        if i == len(list1)-1:
            break
    if list1[-1] != " " and list1[-1] != "\n" and list1[-1] != "\t":
        return wordsCount+1
    return wordsCount

def main():
    string = " "
    wordsCount = countingWordsInString(string)
    print wordsCount

if __name__ == '__main__':
    main()
Another solution. This is also O(n).
int isInvalid(char* p){
    if((*p == ' ')||(*p == '\n')||(*p == '\t'))
        return 1;
    return 0;
}

int getWordCount(char* string){
    int count = 0;
    while(*string != '\0')
    {
        while(isInvalid(string) == 1)
            string++;
        if(*string == '\0')
            break;
        count++;
        while(isInvalid(string) == 0)
            string++;
    }
    return count;
}
super solution...mind blowing..out of this world!!!
elegant!!!
Small and simple, nice!!
Shouldn't the loop in the above algorithm be like this?
while(*string != '\0')
{
    while(isInvalid(string) == 1)
        string++;
    if(*string == '\0')
        break;
    count++;
    /* Additional check on *string != '\0' */
    while((isInvalid(string) == 0) && (*string != '\0'))
        string++;
}
Here '\0' denotes the string terminator.
@micheal: super code.
But doesn't the complexity of this problem, according to your solution, theoretically become O(n^2)?
@Amateur: I think it's O(n); each element is accessed only once.
@he-man: Yes, each element is accessed only once, that's true.
But if you glance at the algorithm for the first time, it looks like O(n^2) because there is a nested loop.
The easiest way to think about iterations is to ask how many subproblems you solve for each element. In this case you solve only one problem (the comparison) for each character, hence O(n).
PS: Very elegant solution.
This is nice. It works for all conditions.
@Sandeep Jain : It doesn't work. http://ideone.com/4hQNtX (see input/output)
There is no scanf statement. Your input is not taken. The output is only for "HE HE" which is correct.
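#include <stdio.h>
#include <string.h>

int main()
{
    char s1[100] = "One two three\n four\nfive ";
    int i, count = 0, f = 0;   /* f is 1 while inside a word, 0 otherwise */
    //printf("Enter String:\n");
    //gets(s1);
    /* The loop bound was cut off in the original post; strlen(s1) is assumed. */
    for (i = 0; i < (int)strlen(s1); i++)
    {
        if (s1[i] == ' ' || s1[i] == '\n' || s1[i] == '\t')
        {
            f = 0;
        }
        else if (f == 0)
        {
            count++;
            f = 1;
        }
    }
    printf("\nThe no of words are:%d", count);
    return 0;
}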
I think if the words are separated by multiple spaces the algorithm won't work.
I got it, very interesting solution...nice one..........
@GeeksForGeeks:
We have to handle one more condition: an input string that starts with a number of whitespace characters, for example
char str[] = "   One two three\n four\nfive "
Please take a closer look. This is also handled in the given program.