Track 1: Managing Your Information

From Asia Source 3

About this track

As technology continues to pervade human activity, digital information accumulates at a phenomenal rate. This data comes from many different sources, and may be structured or unstructured. The challenge now is to find context and relevance, turning this data into meaningful information that stakeholders can use for decision-making and collaboration.

Key Touchpoints

  • Structured vs Unstructured Data
  • Information Life Cycle
    • Creation and Transformation
    • Management and Distribution
    • Analysis and Context
    • Maintenance


Outcomes

  • Share needs, challenges, and best practices in managing information
  • Select open source tools for managing information
  • Gain experience in using the tools
  • Build a community network among track participants
  • Develop strategies to replicate the experience in local communities


Track Schedule

Day 1: Getting Started

  • Getting to know each other
  • What do we want out of this track?
  • What do we understand about managing information? (10m)
  • What does FOSS have to do with managing your information? (10m)
  • What challenges do you face in managing information in your organization now? (Breakout: 20m; Consolidation: 40m)

Hands-on activity: Working with Ubuntu

(Explanation: 10m; Group Work: 60m; Consolidation: 20m)

  • Introduce the participants to Ubuntu using our preinstalled laptops.
  • Get Ubuntu running on their desktops, either through LiveCD, USB pendrive, or Wubi install.
  • Introduce Synaptic as a directory of software; supplement with SourceForge, if possible

Rationale

Ubuntu provides a good platform for evaluating open source software. Many tools that can be used for managing information are already included in the basic distribution. It also has an extensive list of additional software that can be installed from its Internet repositories.

Challenge

  • List down all the software that is included in the basic Ubuntu distribution.
  • Find the software for specific needs (to be listed later, as part of the outcome from What Challenges)

Track reporting

The track session opened with participant introductions, with each person identifying themselves by an animal name. It turned out to be a good way to remember everyone by their animal persona. It was a short introduction for the roughly 30 participants and facilitators in the track, and it helped the day run smoothly.

In the first session, all participants wrote down their needs and reasons for joining the track on sticky notes, which were posted on the wall. These were grouped into four subjects: 1. Skill; 2. Software; 3. Theory; 4. Miscellaneous. The results will serve as the objectives of Track 1.

Data resourcing was the afternoon session of Track 1. To give participants practical experience and a good understanding, they were divided into four groups, each working on data resources in a case study.

The last session of the day was practice with the FOSS tools to be used in processing the data. Ubuntu was chosen as the Free and Open Source Software platform because it contains the applications needed. Since not all of the participants had prior FOSS experience, some were trained on how to boot from the live CD and how to install Ubuntu on their own machines. (by Night owl)

Day 2: Creation and Transformation

  • Where does all your data come from? (Breakout: 20m; Consolidation: 30m): Participants will identify all the data sources in their organization, what forms they come in, how they are transformed and used.
  • File Formats and Services: Open vs Proprietary (10m): Asks the participants if they are easily able to convert the data into other forms, if they can easily transfer it to other services; or are they locked in?

Hands-on activity: Getting to Know OpenOffice.org

(Explanation: 10m; Group Work: 60m; Consolidation: 20m)

Rationale

OpenOffice.org may look like a basic office suite but it still can be a powerful information management tool. This hands-on focuses on spreadsheets to organize structured information, but with avenues for transforming and embedding into other forms.

Challenge

  • Spreadsheet challenge: give the students a large spreadsheet and ask them to extract key data from it and turn it into meaningful representations
  • Design a spreadsheet to act as a participant database: what information should it contain? What about privacy?
  • Transforming the spreadsheet: embedding it into a presentation and a text document; preparing it for import into a database
  • Strategies for teaching OpenOffice.org to communities

Track reporting

There were 28 participants in total. We missed the Llama and the Dog.

Day 2 of Track 1 was devoted to creation and transformation. It started with a recap of the previous day's activities and proceeded with a discussion of FOSS and its advantages. FOSS licensing came up as a topic, and to explore it further, participants requested it as an elective for the Asia Source camp.

A very exciting activity was identifying the applications we find useful in Ubuntu, as well as the applications we would like to see running under Ubuntu. The top three applications the Track 1 participants loved in Ubuntu were OpenOffice.org, Mozilla Firefox, and GIMP. Another application was Synaptic Package Manager, a graphical front end to Ubuntu's software repositories. It helps users find, install, and update the software they need. It replaces the usual installer routine of clicking "next", "next", "next", so finding, installing, and updating software becomes easier.

Finally, the last activity focused on structured data. A Calc file was given, and the participants had to answer five questions. The objective was to extract meaningful information from the given data. Spreadsheet skills were essential for this activity, which ended with the participants presenting their findings. (by Leopard)

Day 3: Management and Distribution

  • How do you organize your files? (Breakout: 20m; Consolidation: 30m) - Participants will discuss their best practices in managing their files and other information. Key point: how are these files and information made available to the rest of the organization?
  • Metadata: Data on data (10m): Discussion of the importance of metadata, where it is present, and available standards

Hands-on activity: Using MediaWiki

(Explanation: 10m; Group Work: 60m; Consolidation: 20m)

Rationale

MediaWiki can be used as a simple yet powerful document management system. This session shows how it can be used to manage unstructured information.

Challenge

  • Creating accounts on MediaWiki
  • Have the students create their personal wiki pages
  • Upload files into MediaWiki, add tags, start discussions

How should we organize our wiki? (Group Discussion: 10m) [Given what we already know about the wiki, what other features would be useful to include?]

Auxiliary topics

  • GIS [point to Mifan's afternoon topic]
  • LAMP
  • Installation of MediaWiki (our MediaWiki will be preinstalled)

Track reporting

Our session started with a recap of what we discussed on Day 2: the list of applications in Ubuntu, and structured data. Then the facilitators asked us to write down on a piece of paper how we manage our files on our personal desktops and in our organizations. This exercise was an introduction to unstructured data.

In an organization, about 80% of the data is unstructured, by the estimate given in the session. Unstructured data includes photos, documents, media files (video and audio), and emails. Since unstructured data is hard to keep track of, manage, and organize, the participants discussed software applications that can help, such as the following:

  • For indexing files on your desktop:
    1. Spotlight for Mac users
    2. Tracker and Beagle for Ubuntu users
    3. Google Desktop
  • For sharing and collaborating on files:
    1. Google Sites
    2. Google Docs and Google Gears
  • For organizing media files:
    1. Picasa
    2. iPhoto for Mac users
    3. Songbird
    4. Miro
  • For file management:
    1. KnowledgeTree
    2. Alfresco

The participants also discussed how to back up files on their desktops, since some people forget to do it, or skip backups because managing the extra copies is too cumbersome. Some participants suggested the following applications for backing up files:

  1. Ubuntu One
  2. Dropbox
  3. Google Docs and Google Gears
  4. Time Machine for Mac users

After the discussion of file backup, one participant asked how secure a file is when it is backed up to a remote site. When uploading sensitive information to a remote site, the file should be encrypted or password-protected. Some suggested using encryption keys, password-protected zip files, or data hashing applications to protect the files. However, too much protection can have disadvantages: the file might be hard to decrypt when needed, and encryption technologies can be costly. Thus, we must remember the three basics of security:

  1. Confidentiality – only authorized people can read the files
  2. Integrity – files are unchanged
  3. Availability – have to be available when needed
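
The integrity point in the list above can be illustrated with a checksum. The following is a minimal Python sketch, not something shown in the session: the helper name `sha256_of_file` and the file name `report.txt` are hypothetical. Note that a hash only detects that a file has changed; it does not keep the contents confidential, so it complements encryption rather than replacing it.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo():
    """Record a file's hash, then show that a later modification is detected."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "report.txt")  # hypothetical file name
        with open(path, "wb") as handle:
            handle.write(b"participant list, version 1\n")
        before = sha256_of_file(path)  # value you would note before uploading

        # Same content gives the same hash.
        unchanged = sha256_of_file(path) == before

        # Any change to the content gives a different hash.
        with open(path, "ab") as handle:
            handle.write(b"tampered line\n")
        changed = sha256_of_file(path) != before
        return unchanged, changed


if __name__ == "__main__":
    print(demo())
```

Comparing the hash recorded before upload with the hash of the downloaded copy is one way to confirm that a remote backup came back unchanged.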

After the discussion, we had an activity using MediaWiki. The facilitators taught us how to create a wiki. The participants created accounts on the AS3 wiki and edited their respective profile pages. (by Jollibee)

Day 4: Analysis and Context

  • How do you maximize your data? (Breakout: 20m; Consolidation: 30m) - Given the data we've generated in the previous days, what additional information can we extract from it? If participant information does not contain enough data, possibly use the world database.
  • Data Mining - Simple data mining from the spreadsheets, other FOSS data mining tools
  • Deep Internet Search - Use archival resources and comparison tools; pull in Bobby to help facilitate

Hands-on activity: OpenOffice.org Base

(Explanation: 10m; Group Work: 60m; Consolidation: 20m)

Rationale

OpenOffice.org Base is an often neglected component of the suite, but it gives users database capabilities beyond what a spreadsheet can provide. We will also look into Google Docs as a possible source of structured data.

Challenge

  • Creating an OO.o database
  • Converting a spreadsheet into an OO.o database
  • OO.o forms
  • Google Docs spreadsheet forms to OO.o database
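
The spreadsheet-to-database step in the list above can also be sketched outside OpenOffice.org Base. The following Python example is only an illustration under stated assumptions: it uses SQLite (via the standard `sqlite3` module) as a stand-in database, and the CSV text, column names, and the function `load_csv_into_table` are all hypothetical. It assumes the spreadsheet has first been exported as CSV.

```python
import csv
import io
import sqlite3

# Stand-in for a spreadsheet exported as CSV; names and columns are made up.
CSV_EXPORT = """name,organization,email
P1,Org A,p1@example.org
P2,Org B,p2@example.org
P3,Org A,p3@example.org
"""


def load_csv_into_table(csv_text, conn):
    """Create a 'participants' table and insert one row per CSV record."""
    reader = csv.DictReader(io.StringIO(csv_text))
    conn.execute(
        "CREATE TABLE participants (name TEXT, organization TEXT, email TEXT)"
    )
    rows = [(r["name"], r["organization"], r["email"]) for r in reader]
    conn.executemany("INSERT INTO participants VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # in-memory database for the demo
    print(load_csv_into_table(CSV_EXPORT, connection), "rows imported")
```

The header row of the spreadsheet becomes the column names of the table, which is the main design decision when preparing a spreadsheet for import.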

Track reporting

Day 5: Maintenance

  • Considerations in maintaining data (10m): security, privacy, backup, inconsistencies
  • How do you maintain your data? (Breakout: 20m; Consolidation: 30m): Expand on the earlier discussion; participants discuss their strategies and best practices

Hands-on activities

  • Encrypting data on your hard disk
  • Backing up data
  • Maintaining Versions
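
As a small illustration of the backup and versioning items above, here is a minimal Python sketch. It is not the tool used in the session; the function name `backup_with_version` and the `.v1`, `.v2` naming scheme are assumptions made for the example. Each call copies the file into a backup folder under the next free version number, so earlier copies are never overwritten.

```python
import os
import shutil
import tempfile


def backup_with_version(source_path, backup_dir):
    """Copy a file into backup_dir under the next free version number."""
    os.makedirs(backup_dir, exist_ok=True)
    base = os.path.basename(source_path)
    version = 1
    while True:
        target = os.path.join(backup_dir, f"{base}.v{version}")
        if not os.path.exists(target):
            break
        version += 1
    shutil.copy2(source_path, target)  # copy2 also keeps file timestamps
    return target


def demo():
    """Back up a file twice and show that the first version is preserved."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "notes.txt")
        backups = os.path.join(workdir, "backups")
        with open(source, "w") as handle:
            handle.write("first draft")
        first = backup_with_version(source, backups)
        with open(source, "w") as handle:
            handle.write("second draft")
        second = backup_with_version(source, backups)
        with open(first) as handle:
            first_content = handle.read()
        return os.path.basename(first), os.path.basename(second), first_content


if __name__ == "__main__":
    print(demo())
```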

We are open to other topics.


Track reporting

Report on Day 6, 12 November 2009.

The objective of the final day was to finalize and consolidate the mind map that we made yesterday. We were asked to add more content to the Track 1 material.

We ran through the tools we had learned about. Many tools had not been covered because of time constraints. Today we summarized the tools that are available by default in Ubuntu 9.10. The tools are:

Music player: Rhythmbox. We can also install other open source applications such as XMMS and ichoose. It supports Internet radio and live streaming radio, with a choice of low- or high-quality streams, so you can listen to online radio over a standard broadband connection.

Podcasts use RSS (Really Simple Syndication), a way to publish updates of music content from a blog or website. A favorite RSS feed reader is Liferea (Linux Feed Reader). We can use Last.fm to play online radio. There is the issue of digital music rights. Jamendo and Magnatune are features in Rhythmbox, which can handle iTunes.

You can record a stream by connecting a microphone and recording it with Audacity.

To save YouTube videos, copy and paste the video URL into KEEPVID.COM. You can also save YouTube videos using the Firefox plugin Video DownloadHelper or tube-TV.

PDF files are usually already compressed, so compressing them again gains little. On Windows you have to download and install WinRAR or WinZip; on Ubuntu a zip program is included by default. On Linux, compressed files are commonly in "tar.gz" format, and on Windows in ".zip".

If you compress a folder, you automatically compress its whole content. A JPEG file can be compressed, but not by much, because JPEG is already a compressed format. The compression ratio between the original and the zip file depends on the file. We can assign a password to a zip file in Ubuntu.
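
The point that already-compressed files barely shrink can be checked with a short experiment. This is a minimal Python sketch, assuming the standard `zlib` module (which uses DEFLATE, the same method as zip and gzip) as a stand-in for the zip tools mentioned above. The sample vocabulary and the helper names are made up for the illustration.

```python
import random
import zlib


def compression_ratio(data):
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


def make_sample_text(word_count=20000, seed=42):
    """Build repetitive, text-like bytes from a small made-up vocabulary."""
    words = [b"data", b"file", b"backup", b"wiki", b"record", b"archive"]
    rng = random.Random(seed)
    return b" ".join(rng.choice(words) for _ in range(word_count))


def demo():
    """Compare compressing plain text with compressing compressed data."""
    text = make_sample_text()
    already_compressed = zlib.compress(text, 9)
    return compression_ratio(text), compression_ratio(already_compressed)


if __name__ == "__main__":
    first, second = demo()
    print(f"plain text ratio:         {first:.2f}")
    print(f"already compressed ratio: {second:.2f}")
```

The first ratio should be well below 1, while the second stays close to 1, which is the behaviour described above for JPEG and PDF files.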

Resizing pictures changes the quality level and reduces the file size. If you compress pictures, you may lose data. Saving a picture as JPEG in GIMP at a lower quality setting reduces its quality and color detail. The DPI will change.

Evolution is the default email client in Ubuntu, and it comes with features such as contacts and a calendar. We can use Evolution for multiple email accounts that provide POP3 or IMAP access, including Gmail. We can import emails from Outlook into Evolution, as well as contacts in CSV or XML format. You can back up your email in Evolution and restore it later. When configuring the account, you choose between IMAP and POP3. With IMAP, the client shows a synchronized copy of what is on the server, and the messages stay on the server. With POP3, messages are downloaded to the email client and, by default, deleted from the server. You can pull Gmail contacts into Evolution.

How to make a strong password: it should be neither too short nor too long, with a minimum of 8 characters. Do not use words from the dictionary, and don't use your username as the password. Use a combination of letters, numbers, and non-alphanumeric characters. We can't use a space in the password. Choose something you can remember.
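
The password recommendations above can be written down as a simple checklist. The following Python sketch is only an illustration: the function name `password_problems` and the tiny word list are hypothetical, and a real check would load a full dictionary file. Since no maximum length was given in the session, the "not too long" rule is not encoded.

```python
import string

# Tiny stand-in word list; a real check would load a full dictionary file.
COMMON_WORDS = {"password", "welcome", "computer", "sunshine"}


def password_problems(password, username):
    """Return a list of rule violations; an empty list means all checks pass."""
    problems = []
    if len(password) < 8:
        problems.append("shorter than 8 characters")
    if " " in password:
        problems.append("contains a space")
    if password.lower() == username.lower():
        problems.append("same as the username")
    if password.lower() in COMMON_WORDS:
        problems.append("is a dictionary word")
    if not any(ch.isalpha() for ch in password):
        problems.append("has no letter")
    if not any(ch.isdigit() for ch in password):
        problems.append("has no digit")
    if not any(ch in string.punctuation for ch in password):
        problems.append("has no symbol")
    return problems


if __name__ == "__main__":
    for candidate in ("password", "Owl-night42"):
        print(candidate, "->", password_problems(candidate, "niken") or "ok")
```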

Alfresco is an application for document management. MediaWiki is quick and easy to learn. On Windows you can use Microsoft SharePoint; the open source options are Alfresco and KnowledgeTree. There is dual licensing: an open source version and a commercial version with support, depending on your needs. Alfresco is Java-based, and you need to install a database beforehand. It should be installed on a server. Starting it up takes approximately 10 minutes. It is meant for organizations, not for personal document management. Some of its features are a blog component and a document library, and we can add metadata to documents. We can preview our documents and manage their permissions. Alfresco has modest server requirements. It can run on its own without a separate web server; you just need a database back end.

Can you upload a whole folder? No, you still need to upload files individually. There is a feature to put notes on a file, but we can do that in OpenOffice.org. When we open a document, it shows a preview of all its pages. You can't edit a document in Alfresco. Versioning is easier in MediaWiki because it won't replace the original file when a file with the same name is uploaded.

There is a type of program called groupware, which is a repository for an organization's calendar and video but with no email capability. Compatibility between old and new versions is often an issue for some Microsoft Office software, but in OpenOffice.org it is not an issue.

Ghost is another application that was mentioned in a previous discussion. It makes an exact copy of the data, including the operating system. You can place the hard disk in another computer and it will run with the same data. We also discussed hard disk partitions. It is better to have separate partitions, with the operating system installed on one and the data stored on another. That way, one can upgrade or install a new Linux distro on one partition without affecting the data.

The facilitator showed how to filter data using MySQL commands. There is a graphical interface for this, but the important thing is that we can filter and get more out of our data faster than the usual way.
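
The kind of filtering described above can be sketched as follows. The session used MySQL; this Python example swaps in SQLite (the standard `sqlite3` module) so that it runs without a database server, and the table, column names, and sample rows are all hypothetical. The `WHERE` and `GROUP BY` clauses shown work the same way in MySQL.

```python
import sqlite3

# Hypothetical sample rows: (name, country, years of FOSS experience)
ROWS = [
    ("P1", "Country A", 3),
    ("P2", "Country B", 1),
    ("P3", "Country A", 0),
    ("P4", "Country C", 5),
]


def build_database():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE participants (name TEXT, country TEXT, foss_years INTEGER)"
    )
    conn.executemany("INSERT INTO participants VALUES (?, ?, ?)", ROWS)
    return conn


def experienced_from(conn, country, min_years):
    """Filter: names from one country with at least min_years of experience."""
    query = (
        "SELECT name FROM participants "
        "WHERE country = ? AND foss_years >= ? ORDER BY name"
    )
    return [row[0] for row in conn.execute(query, (country, min_years))]


def count_by_country(conn):
    """Summarize: number of participants per country."""
    query = (
        "SELECT country, COUNT(*) FROM participants "
        "GROUP BY country ORDER BY country"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    database = build_database()
    print(experienced_from(database, "Country A", 1))
    print(count_by_country(database))
```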

Hands-on Activities/Demo

    Demo on how to install and use the Liferea RSS feed reader.
    Demo on how to set up and use Evolution as a mail client and RSS feed reader.
    Demo of using Beagle to search through file contents, file names, emails, and chat logs. It works like Google Desktop.

(by Niken)
