Good programmers think before they begin coding. Part I of this assignment involves brainstorming with a group of peers, with no computers, to talk about a strategy for solving this week's lab. Break up into groups based on your seating (3-4 people per group) and brainstorm about how to solve the problems below. Make sure everyone understands the problem, and sketch out potential ways to move toward a solution. You may find it helpful to look over the required readings for this week.
Note: Brainstorms are to help you get started with the lab and get you in the habit of designing before you code. You won't submit them to eLC.
Introduction
Twitter and similar micro-blogging platforms allow users to broadcast short messages to a potentially large audience. With Twitter, the messages are called tweets and consist of 140 characters or fewer. Tweets are often sent from mobile phones, and tweeters can choose whether their tweets are public or not.
The platforms allow people to communicate in ways that were not possible even a few years ago, and they have already had a profound effect on society. They even affect how society reacts to natural and manmade disasters: ordinary citizens can now broadcast timely and location-specific information that is of vital importance to both emergency management agencies and members of the public.
Sifting through the tweets for useful information is difficult, however. As a result, there have been proposals to manually add structure to tweets sent during disasters. One proposal is called Tweak the Tweet (TtT).[1] In TtT, information is marked using a hashtag such as #loc (for location):
MT @carlseelye: #lovelandfire #Wind just switched and now the smoke is thick around our house #loc 40.352,-105.2045
Adding the structure makes it far easier for a computer to classify the tweets.
In this lab, you will use methods of the String class to process messages similar to TtT tweets (the data is made from real TtT data but it has been altered to suit the lab). You will use the substring and other methods to pull out information from the text, manipulate it, and print it out to the screen.
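As a small illustration of the idea, the sketch below pulls the #loc value out of the TtT tweet shown above using indexOf and substring. The class name LocSketch and the helper locationOf are made up for this example, and the code assumes the #loc value runs to the end of the message, which is true for that tweet but not for the format used later in the lab.

```java
// Illustrative sketch only; LocSketch and locationOf are hypothetical names.
class LocSketch {
    // Returns the text that follows "#loc " in the tweet.
    // Assumes the tag is present and its value runs to the end of the tweet.
    static String locationOf(String tweet) {
        String tag = "#loc ";
        int tagIndex = tweet.indexOf(tag);               // index of the '#' in "#loc "
        return tweet.substring(tagIndex + tag.length()); // skip past the tag itself
    }

    public static void main(String[] args) {
        String tweet = "MT @carlseelye: #lovelandfire #Wind just switched and now "
                + "the smoke is thick around our house #loc 40.352,-105.2045";
        System.out.println(locationOf(tweet)); // prints 40.352,-105.2045
    }
}
```

Without the #loc marker, a program would have to guess which part of the message is a location, which is why the added structure helps.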
Lab Objectives
By the end of the lab, you should:
- understand how String constants and String objects are represented in Java;
- be able to declare, initialize, and assign variables of type String;
- perform string processing using the methods of the String class;
- concatenate strings (and other values) together using the + operator.
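The short sketch below touches each of these objectives once. It is only an illustration: the class name StringBasics, the helper label, and the values are hypothetical, and the comparison of == with equals is one way to see the difference between a String constant and a separately created String object.

```java
// Illustrative sketch only; the class name and values are hypothetical.
class StringBasics {
    // Concatenates a name, a separator, and a number into one String.
    static String label(String name, int count) {
        return name + ": " + count;
    }

    public static void main(String[] args) {
        String constant = "fire";            // a String constant (literal)
        String object = new String("fire");  // a separate String object with the same text

        System.out.println(constant.equals(object)); // true: same characters
        System.out.println(constant == object);      // false: two different objects

        String report = "request";            // declare and initialize
        report = "offer";                     // assign a new value
        System.out.println(label(report, 9)); // prints offer: 9

        // + is evaluated left to right, so numbers are added until a String appears.
        System.out.println(1 + 2 + " tweets"); // prints 3 tweets
        System.out.println("tweets " + 1 + 2); // prints tweets 12
    }
}
```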
Prerequisites
The lab deals with material from Chapter 2 (specifically, the discussions of the String class). It assumes you know how to define simple classes in Java, and that you can declare and assign values to variables.
Exercise: Parsing Tweets
The class ParseTheTweet will read in a single tweet from the keyboard, and so when writing your program, you will need to use the Scanner class in addition to the String class. As usual, almost everything you write will go in the main method of the ParseTheTweet class.
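Reading one line from the keyboard might look roughly like the sketch below. The class name ReadTweetSketch, the helper readTweet, and the prompt wording are assumptions made for this example; in your lab, the equivalent statements would go in the main method of ParseTheTweet, and your prompts must match the lab's examples.

```java
import java.util.Scanner;

// Illustrative sketch only; ReadTweetSketch and readTweet are hypothetical names.
class ReadTweetSketch {
    // Reads one whole line of text (up to the end of the line) from the Scanner.
    static String readTweet(Scanner keyboard) {
        return keyboard.nextLine();
    }

    public static void main(String[] args) {
        // One Scanner for the whole program, reading from the keyboard.
        Scanner keyboard = new Scanner(System.in);

        System.out.println("Enter a tweet:"); // prompt wording is an assumption
        String tweet = readTweet(keyboard);

        System.out.println("You entered: " + tweet);
    }
}
```

nextLine is used here rather than next because next stops at the first whitespace, and a tweet contains spaces.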
The tweets processed by the class all encode the following information using so-called hashtags:
The report type (#typ); some further detail (#det); a location (#loc) such as a street address; and latitude (#lat) and longitude (#lng). The type indicates the meaning of the tweet (i.e., whether it is a request for help or reports factual information), and the report detail provides additional information. Each of the hashtags is followed by some value (such as an actual latitude or longitude value).
When writing your code, you can assume that all of the tweets to process have the following format:
#typ value; #det value; #loc value; #lat value; #lng value;
That is, they consist of a series of hashtags, each followed by a value, and each value followed by a semicolon (;). Nine sample tweets are provided at the end of this document. You should test your code on them.
Instructions
- At the top of your source file, include a comment stating your Java class name, your name, the date, the program purpose, and containing the statement of academic honesty as you did in previous labs.
- Use the Scanner class (as discussed in lecture) to read in a tweet entered by the user and store it in a String variable named tweet.
▪ Hint: Using the Scanner class involves 1) importing a package, 2) creating a Scanner object, and 3) using the appropriate method to read in a whole line of text.
- You will be splitting up (parsing) the information in the tweet into the 5 different types of information specified by the hashtags (type, location, detail, latitude, & longitude). Declare variables to store each of these pieces of information. Be sure to give these variables the appropriate data types and meaningful names.
- Now you actually need to divide the information in the tweet into the separate substrings of information. Declare 2 variables (start and finish) to hold the indices of where each substring starts and finishes.
▪ Hint: Each substring starts with a hashtag and finishes with a semicolon. (Is there a String class method that you can use to get the index of #typ or a ; ?)
start     finish
|         |
#typ value; #det value; #loc value; #lat value; #lng value;
- Once you have numbers assigned to start and finish, we want to discard the #tag and extract only the value. We know that the ; is where the value finishes, but we need to find the index where the actual value starts. Hint: our start variable currently points to the index of the #, and we know that all hashtag identifiers have the same format: a hashtag, 3 letters, and a space. Can we do simple math here to figure out the starting position of our value?
- Once we have the correct starting and ending positions for the value, we want to extract the substring at those positions and assign it to the appropriate variable declared in step 3.
- The trim() method removes leading and trailing white spaces (if any) from a String. Use the trim() method on each resultant String.
- Hint: The trim() method returns a modified String, but does not alter the String object that calls it. Ensure you are (re-)assigning a String to the result of trim().
Example
String original = "  text here  ";
String trimmed = original.trim(); // trimmed contains "text here"
// At this point, the original string still contains "  text here  "
- After extracting the value encoded by each hashtag, we discard that part of the tweet String, since we are finished with it and are ready to repeat these steps for the next hashtag. We can use the substring method to extract the substring of our tweet variable starting where the last hashtag finished (Hint: we know it finishes at a semicolon and we have that index stored in our finish variable, and we want to start right after that semicolon. Also, remember that if we pass the substring method only 1 value, it begins at that index and goes until the end of the String).
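[Figure: the part of the tweet up to and including the first semicolon is labeled DISCARD; the remainder is labeled SUBSTRING STORED IN TWEET.]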
- The type values come from a very small set of values, and we want to make them all caps. To do this, use the toUpperCase method of the String class.
- We also want to ensure that the detail and location values are free of commas (which might pose an obstacle to further processing). Use the replace method of the String class to do this. Replace each comma with a single hyphen (-).
- Now use System.out and print or println statements to produce formatted output, as shown in the included examples.
HINT: Use the escape sequence \t to include tabs as needed.
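The steps above can be sketched end to end as follows. Treat this as one possible outline under stated assumptions, not the required solution: the class name ParseSketch, the helpers firstValue and dropFirstField, the sample tweet, and the output labels are all made up for illustration, and your output must follow the lab's own examples. The sketch also uses a fixed tweet instead of Scanner input, keeps latitude and longitude as Strings for simplicity, and assumes no value contains a # or a ;.

```java
// Illustrative sketch only; names, sample tweet, and output labels are hypothetical.
class ParseSketch {
    // Extracts the value of the first "#tag value;" field in text.
    static String firstValue(String text) {
        int start = text.indexOf("#");   // index of the '#' that begins the tag
        int finish = text.indexOf(";");  // index of the ';' that ends the value
        // A tag is '#', 3 letters, and a space, so the value begins 5 characters later.
        return text.substring(start + 5, finish).trim();
    }

    // Discards everything up to and including the first semicolon.
    static String dropFirstField(String text) {
        int finish = text.indexOf(";");
        return text.substring(finish + 1); // one argument: runs to the end of the String
    }

    public static void main(String[] args) {
        // Hypothetical tweet in the lab's format (not one of the lab's sample tweets).
        String tweet = "#typ offer; #det free essential supplies, water; "
                + "#loc 2323 55th st, boulder; #lat 40.022; #lng -105.226;";

        String type = firstValue(tweet).toUpperCase();      // type in all caps
        tweet = dropFirstField(tweet);

        String detail = firstValue(tweet).replace(",", "-"); // commas become hyphens
        tweet = dropFirstField(tweet);

        String location = firstValue(tweet).replace(",", "-");
        tweet = dropFirstField(tweet);

        String latitude = firstValue(tweet);
        tweet = dropFirstField(tweet);

        String longitude = firstValue(tweet);

        // \t inserts a tab; the labels and layout here are assumptions.
        System.out.println("Type:\t\t" + type);
        System.out.println("Detail:\t\t" + detail);
        System.out.println("Location:\t" + location);
        System.out.println("Latitude:\t" + latitude);
        System.out.println("Longitude:\t" + longitude);
    }
}
```

Because each processed field is discarded, the first # and the first ; in what remains always belong to the next field, so the same two lookups work for all five hashtags. The helper methods are a design choice made only to avoid repeating code here; the lab asks you to write the equivalent statements in main using your start and finish variables.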
Additional Requirements
These are things that make the graders' lives easier, and that you will ultimately encounter in the real world as well. Remember that the teaching staff does not want to touch your code after they have given you requirements; they want to see the perfect results they asked for! Here is a checklist of things you can lose points for:
- (10 points) If the source file(s)/class(es) are named incorrectly (case matters!)
- (10 points) If your source file(s) have a package declaration at the top
- (10 points) If any source file you submit is missing your Statement of Academic Honesty at the top of the source file. All submitted source code files must contain your Statement of Academic Honesty at the top of each file.
- (10 points) If you have more than one instance of Scanner in your program. Your program should only have one instance of Scanner.
- (15 points) Inconsistent I/O (input/output) that does not match our instructions or examples exactly (unless otherwise stated in an assignment's instructions). Your program's I/O (order, wording, formatting, etc.) must match our examples and instructions.
- (100 points) If the source file(s) are not submitted before the specified deadline's late period ends (48 hours after the deadline) or if they do not compile.
- (25 points) Late penalties will be deducted as per the course syllabus.
- If your (10 points) comments or (10 points) variables are "lacking"
o Here, "lacking" means that you or a TA can find any lines of code or variables that take more than 10 seconds to understand, and there is no comment, or the variable name does not make sense (variable names like b, bb, bbb, etc. will almost never be acceptable)
- (10 points) Indentation is not consistent throughout your source code
o Refresh your memory of indentation patterns in chapter 2 in the course textbook
o Be careful of a combination of tabs and spaces in your files (use one or the other)!
If any of the above do not make sense to you, talk to a TA, or ask on Piazza!