Ye Olde
Talbott Blog

RubyConf 2005 - Morning Day One

San Diego, California

Welcome

David Alan Black

Francis Hwang

Top-to-bottom Testing in Ruby

Examples

Fibonacci

Testing is harder in the real world

  • Complexity
  • External components

The quality elbow

From C2 Wiki. The basic idea is that past a certain point, spending more time (money) no longer translates linearly into more quality.

Mocking

How do we inject the Mock? ( mocks and singletons )

  1. Email 1
  2. Email 2
  3. Email 3

As we move along the quality elbow, we have to decide how much trouble mocking is worth. Lafcadio goes a long way toward abstracting you from SQL, which allows easier mocking.
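One common answer to the injection question is to pass the collaborator in through the constructor, so a test can hand over a mock. This is a minimal sketch and not code from the talk; `MockMailer`, `SignupService` and their methods are hypothetical names.

```ruby
# Hypothetical hand-written mock: records mail instead of sending it.
class MockMailer
  attr_reader :deliveries

  def initialize
    @deliveries = []
  end

  # Same signature the (assumed) real mailer would expose.
  def deliver(to, subject)
    @deliveries << [to, subject]
    true
  end
end

# Hypothetical production class: the mailer is injected, never hard-coded.
class SignupService
  def initialize(mailer)
    @mailer = mailer
  end

  def register(email)
    @mailer.deliver(email, "Welcome")
  end
end

mailer = MockMailer.new
SignupService.new(mailer).register("user@example.com")
p mailer.deliveries
```

Because Ruby only cares that the object responds to `deliver`, the mock needs no shared base class or interface.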

Seniors – unit tests, mocks and the database

Trade-offs

  • Upsides
    • Speed
    • No side-effects and no cleanup
  • Downsides
    • Indirection
    • Possible bugs in mock class
    • Time spent to build mock

You could mock anything!

Dynamicity is your friend

Because Ruby is so dynamic and flexible (duck typing), mock objects are particularly easy and powerful.

You can create utilities to make mocking easier. Francis has created Domain Mock to basically set up default mocks, so that in the usual case, you don’t have to worry about setting things explicitly.

Remember you can re-open classes, and add utilities when the right file is required.

You can override standard library classes so that the mocking is completely transparent to the production code. It’s questionable if this is a “good” idea, since it may cause confusing issues in the tests. See this experimental support in MockFS.
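To make the idea of transparent overriding concrete, here is a rough sketch that temporarily replaces `Time.now` and puts it back afterwards. It is not how MockFS works; `greeting` and `with_frozen_time` are hypothetical names chosen for illustration.

```ruby
# Hypothetical production code: calls Time.now directly and knows nothing about tests.
def greeting
  Time.now.hour < 12 ? "Good morning" : "Good afternoon"
end

# Hypothetical test helper: swaps Time.now for the duration of the block, then restores it.
def with_frozen_time(fixed)
  Time.singleton_class.class_eval do
    alias_method :now_without_freeze, :now
    define_method(:now) { fixed }
  end
  yield
ensure
  Time.singleton_class.class_eval do
    alias_method :now, :now_without_freeze
    remove_method :now_without_freeze
  end
end

with_frozen_time(Time.utc(2005, 10, 14, 9, 0, 0))  { puts greeting }
with_frozen_time(Time.utc(2005, 10, 14, 15, 0, 0)) { puts greeting }
```

The production method is untouched, which is the appeal. The cost is that every other caller of `Time.now` inside the block sees the fake value too, which is the kind of confusion the notes warn about.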

Further improvement

  • Test-centric libraries
  • Domain-specific test languages

Test-centric libraries are libraries more amenable to mock and standard testing. Besides making mocks for user tests, this means not depending on the environment directly.

Feedback

  • DHH: There’s a way to get around the trade-offs of speed and side-effects and still use the database… using transactional fixtures in rails. Basecamp tests run in 60 seconds.
    • (Which seems a little long to me – Nathaniel)
  • Austin Ziegler: The contextual service from the third email example is really just another way of doing dependency injection.
    • FHwang: Yes, ContextualService does something just like Needle, but the differences are stylistic
  • DHH: Thought about doing Rails on Needle, but found a simpler way. Just change the load path so that a mock file gets loaded before/instead of the real file.
    • (Dave Fayram: Rails’ approach is outstanding, when it’s applicable. A defined load order is considered archaic in some languages, but Ruby benefits from it in this case)
  • Aslak: One of the downsides of mocks is you have to write the mocks yourself… there are libraries that will generate mocks for you.
    • FHwang: Complexity of mocks depends on object to be mocked, prefers to write his own in most cases.
  • Aslak – Mocks and stubs are different.
  • ??? – You could use a temp directory instead of MockFS.
    • Ryan Davis: If you use a temp directory, you might not clean up correctly, and then the next test may fail.
  • ??? – Do mocks allow better portability of development? (Sort of rhetorical)
    • FHwang: A lot of times he starts up a new Lafcadio project and doesn’t even have a database for it in the beginning
  • Jon Tirsen: Mocks do not prove the system really works. You really need integration level testing, too.
    • FHwang: “I agree.”
  • Patrick May: Whose responsibility is it to create the mocks? The library writer, or the person doing the testing?
    • FHwang: It would be nice if library providers would provide mocks.
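The load path suggestion from the feedback can be tried outside Rails. The sketch below is only an illustration of the mechanism, not Rails' actual setup: it builds two throwaway directories that each contain a `demo_mailer.rb` (a hypothetical file name) and puts the mock directory first in `$LOAD_PATH`.

```ruby
require 'tmpdir'
require 'fileutils'

# Two throwaway directories, each holding a file with the same name.
# (Hypothetical names; a real project would already have these files.)
root = Dir.mktmpdir
real_dir = File.join(root, "lib")
mock_dir = File.join(root, "mock")
FileUtils.mkdir_p([real_dir, mock_dir])
File.write(File.join(real_dir, "demo_mailer.rb"), "module DemoMailer; KIND = :real; end\n")
File.write(File.join(mock_dir, "demo_mailer.rb"), "module DemoMailer; KIND = :mock; end\n")

# Production load path first, then the test setup puts the mock directory in front of it.
$LOAD_PATH.unshift(real_dir)
$LOAD_PATH.unshift(mock_dir)

require 'demo_mailer'   # mock_dir is searched first, so the mock file wins
p DemoMailer::KIND      # prints :mock

# Remove the demo entries and directories again.
$LOAD_PATH.delete(real_dir)
$LOAD_PATH.delete(mock_dir)
FileUtils.remove_entry(root)
```

The code that says `require 'demo_mailer'` never changes; only the search order does.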

open-uri, Easy-to-Use and Extensible Virtual File System

Akira Tanaka

Free Software Initiative Group,
Information Technology Research Institute,
National Institute of Advanced Industrial Science and Technology (AIST)
2005-10-14
akr@m17n.org

Table of Contents

  • Who am I?
  • How to use open-uri
  • Why open-uri?
  • open-uri and net/http
  • How to design easy-to-use API
  • Easy-to-use vs. security
  • VFS – Virtual File System

Who am I (1)
The author of open-uri and several standard libraries:

open-uri.rb, pathname.rb, time.rb, pp.rb, prettyprint.rb, resolv.rb, resolv-replace.rb, tsort.rb

Who am I (2)

Contribution for various classes and methods.

IO without stdio
IO#read and readpartial

Time Time.utc

Regexp…

Process.daemon
fork kills all other threads

Who am I (3)

I report many bugs, over 100/year.

  • core dump
  • test failure
  • build problem
  • mismatch between doc. and imp.
  • etc.

Who am I (4)

I wrote several non-standard libraries.

  • htree

Why open-uri?

Simple Usage

require 'open-uri'

Why open-uri?

  • Easy-to-use API
  • VFS: Not only HTTP

net/http has Too Many Ways

(shows all the ways you can do a get)

open-uri has Fewer Ways

open(uri) {|f| }
uri.open {|f| }
uri.read
  • Save User’s Memory (Brain Memory)
  • Reuse User’s Knowledge

net/http: get and print

Net::HTTP.get_print(URI("http://host"))
print Net::HTTP.get(URI("http://host"))

Too complicated, too specialized. #get_print is the same size. Convenience methods aren’t so convenient if they don’t make your code shorter.

open-uri: get and print

open("http://host") {|f|
  print f.read
}
print URI("http://host").read

get and print
net/http:
  • Net::HTTP.get_print : print only
  • Net::HTTP.get : good
open-uri:

Why Easy?

open("http://host")
  • No new construct
  • Users don’t need to learn.

open-uri respects user knowledge.

net/http: headers

  • No URI anymore
  • No Net::HTTP.get anymore
  • Net::HTTP.start, Net::HTTP#get and Net::HTTP::Response#body instead.

open-uri: headers

open("http://host", "User-Agent" => "bar") {|f|
  p f.content_type
  print f.read
}

  • Still URI
  • Still open method
  • Fewer things to learn
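For anyone trying the headers example on a current Ruby: since Ruby 3.0, `Kernel#open` no longer opens URLs, and the same call is spelled `URI.open`. The sketch below keeps the shape of the slide but runs against a one-shot server on a loopback port so it needs no outside network; `serve_once` is a hypothetical helper written just for this demo.

```ruby
require 'open-uri'
require 'socket'

# Hypothetical helper: answers exactly one HTTP request on a loopback port.
def serve_once(body)
  server = TCPServer.new("127.0.0.1", 0)
  port = server.addr[1]
  thread = Thread.new do
    client = server.accept
    # Discard the request line and headers up to the blank line.
    loop do
      line = client.gets
      break if line.nil? || line == "\r\n"
    end
    client.write(["HTTP/1.1 200 OK",
                  "Content-Type: text/plain",
                  "Content-Length: #{body.bytesize}",
                  "Connection: close",
                  "",
                  body].join("\r\n"))
    client.close
    server.close
  end
  [port, thread]
end

port, thread = serve_once("hello\n")

# String keys are sent as request headers, as in the talk's example.
URI.open("http://127.0.0.1:#{port}/", "User-Agent" => "bar") { |f|
  p f.content_type
  print f.read
}
thread.join
```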

net/http: SSL

  • Different library: net/https
  • Net::HTTP.new and Net::HTTP#start
  • Different port
  • Server verification not by default

open-uri: SSL

open("https://host") {|f|
  print f.read
}
  • Still URI
  • Still open method
  • Still verification by default
  • No new library
  • No new methods. Fewer things to learn.

net/http: proxy

  • New method: Net::HTTP.get_proxy (?)

open-uri: proxy

% http_proxy=http://proxy:8080
  • Conventional environment variable supported
  • No new methods. A user might know this already.
  • Fewer things to learn

net/http: basic auth

  • New class: Net::HTTP::Get
  • New method: Net::HTTP#request

open-uri: basic auth

open("..",
     :http_basic_authentication => ["user", "pass"]) { |fd|

}

  • Still URI
  • Still open method
  • New options: :http_basic_authentication
  • No new methods. Fewer things to learn.

How to Design Easy-to-Use API

  • Save brain power
  • Evolve gradually

Save Brain Power – Fewer things to learn

  • Fewer constructs for pragmatic languages
  • Huffman coding
  • DRY
  • No configuration is good configuration
  • Reuse user knowledge
  • Infrastructure friendly

Fewer constructs for Pragmatic Usages

Fewer constructs decrease things to learn

  • open vs. Net::HTTP.get, Net::HTTP#get, etc.
  • This is not minimalism.
  • The target of “fewer” is not all constructs.

Pragmatic usages should be supported by small constructs.

Fewer Constructs (2)

Image – the idea is to support the most common uses with convenience methods and make them easy to use.

Ex. net/http and open-uri

  • Methods frequently used:
    • net/http: Net::HTTP.start, Net::HTTP#get
    • open-uri: open

open-uri’s fewer constructs support many more features

Huffman Coding

  • Shorter for frequent things
  • Longer for rare things

Optimize for frequent things.
Ex: p

Huffman Coding (2)

Image – the idea is that frequently used methods should be shorter, and rarely used methods should be longer.

Ex. p

p obj
  • Very frequently used
  • Bad name in common sense
  • Almost no problem because everyone knows
    • (you can make things shorter when they’re used more often because the name is less important for figuring out what the method does)

Ex. pp and y

  • Bad name in common sense
  • More problematic than p because not everyone knows them

Ex. to_s and to_str

  • to_s: shorter. frequently used.
  • to_str: longer. internal use
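A small sketch of the difference, using hypothetical `Label` classes: `to_s` is what you (and `puts`, and interpolation) call explicitly, while `to_str` is what Ruby calls internally when it needs a real String.

```ruby
# Hypothetical classes for illustration.
class Label
  def initialize(text)
    @text = text
  end

  # Explicit conversion: short name, used all the time.
  def to_s
    @text
  end
end

class StringLikeLabel < Label
  # Implicit conversion: longer name, called by Ruby itself.
  def to_str
    @text
  end
end

puts "name: #{Label.new('akr')}"                  # interpolation uses to_s
puts "name: " + StringLikeLabel.new("akr")        # String#+ uses to_str

begin
  "name: " + Label.new("akr")                     # no to_str, so no implicit conversion
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```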

Ex. def

  • def: shorter. frequently used.
  • define_method: longer. not encouraged.

Ex. time.rb

Time.parse: frequently used.
Time.strptime: generic. You need to learn the format

Time.parse is less flexible but enough for most cases, and easy to learn
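Side by side, the trade-off looks roughly like this. The date strings are made up for the example; both calls come from the `time` standard library.

```ruby
require 'time'

# Time.parse guesses the layout of common date strings.
parsed = Time.parse("2005-10-14 09:30:00 UTC")

# Time.strptime accepts any layout, but you have to spell out the format.
strict = Time.strptime("14/10/2005 09:30 +0000", "%d/%m/%Y %H:%M %z")

puts parsed.getutc.iso8601
puts strict.getutc.iso8601
puts parsed == strict
```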

Candidates for Huffman Coding

  • Method name
  • Other name
  • Convenience method
  • Language syntax
  • etc.

Length for Huffman Coding

  • Number of characters
  • Number of nodes in AST
  • Editor keystrokes
  • etc.

Encourage Good Style

  • Programmers like short code
  • Short code should be designed as good style

DRY

DRY Violation

r = h.request(q)
print r.body

}

No configuration is good configuration
Things that should work well out-of-box

  • SSL CA Certificates
  • http_proxy environment variable

Bad Examples

  • ext/iconv/config.charset
  • soap_use_proxy
  • require "irb/completion"
  • RUBYOPT=rubygems

Reuse User Knowledge

  • open-uri reuses user knowledge
    • open is used to access an external resource
  • If a block is given for open, it is called with a file object

Various knowledge about open is reused.
Fewer things to learn.

Reusable Knowledge

  • Ruby builtin (popular) method
  • Consistency
  • Unix
  • Standards: POSIX, RFC, etc.
  • Metaphor

Metaphor

  • HTTP is a kind of network file system
  • open-uri doesn’t support beyond file system: POST, etc

Infrastructure friendly

  • emacs, vi
  • line-oriented tools
  • shell and file system
  • web browser

Prefer “It is easy using the legacy tool XXX” over “It is easy using the new tool YYY”

Evolve Gradually

  • Adaptive Huffman coding
  • How to find bad API
  • How to avoid incompatibility
  • Incompatible change

Adaptive Huffman Coding

What methods are used frequently?

  • Long method name at first
  • Alias to short name later
  • Define convenience methods for idioms

Adaptive Huffman Coding (2)

  • Short names and operators should be used carefully
  • Use a long name if you hesitate
  • An alias is not a bad thing (TMTOWTDI)
  • Primitives should have long names
  • Define new method for idiom

Operators

  • CGI#[] and CGI#params. CGI#[] was used unsuitably.
  • Hash#[]
    • primitive: Hash#fetch

How to Find Bad API

  • Repeated surprise
  • Often cannot remember
  • Idiom

Repeated surprise

  • Example
    • Time#utc is destructive
    • Iconv.iconv returns an array
    • etc.
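The `Time#utc` surprise is easy to reproduce. In this sketch (the date is arbitrary), `utc` converts the receiver itself, while `getutc` returns a new object and leaves the original alone.

```ruby
local = Time.new(2005, 10, 14, 9, 0, 0, "+09:00")
converted = local.utc            # converts the receiver in place
p converted.equal?(local)        # same object comes back
p local.hour                     # the original has changed

fresh = Time.new(2005, 10, 14, 9, 0, 0, "+09:00")
copy = fresh.getutc              # non-destructive alternative
p fresh.hour
p copy.hour
```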

Often cannot remember

Manual is required again and again for same issue

  • RubyUnit
  • optparse

RubyUnit / Test::Unit

Compares both, shows simplicity of test::unit

Doesn’t like the need to have a slash in test/unit, and it’s also a pain to have to extend Test::Unit::TestCase.

optparse

It’s often difficult to remember the various pieces necessary, like ARGV.options and opts.parse!

Idiom

  • Repeated code
  • Violate DRY
  • An idiom may be good
  • an idiom may be bad

Bad idiom example:

  • Iconv.iconv()[0]

How to Avoid Incompatibility

Extension without Incompatibility

  • New method
  • New keyword argument
  • New constants
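As an illustration of extending without breaking callers, here is a hypothetical method that gains an optional argument in a later version. In 2005 this would have been an options hash; the sketch uses today's keyword syntax, and `fetch_lines` and `skip_blank` are made-up names.

```ruby
# Version 1 of a hypothetical library method looked like this:
#   def fetch_lines(text)
#     text.lines.map(&:chomp)
#   end

# Version 2 adds an optional keyword; existing one-argument calls keep working.
def fetch_lines(text, skip_blank: false)
  lines = text.lines.map(&:chomp)
  skip_blank ? lines.reject(&:empty?) : lines
end

p fetch_lines("a\n\nb\n")                     # old call style, old behaviour
p fetch_lines("a\n\nb\n", skip_blank: true)   # new feature is opt-in
```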

(Some stuff was missed in here due to technical difficulties [Network access and SEE crash])

fork: warning after change

  • Ruby 1.6: No warning
  • Ruby 1.8.0: No warning
  • Ruby 1.8.1: warning: fork terminates thread
  • Ruby 1.8.2: No warning

IO#read: warning before change

IO#read will block even if O_NONBLOCK is set.

(warning changes between versions)

warning only in verbose mode.

Easy-to-Use vs. Security

  • HTTP_PROXY
  • http://user:pass@host/
  • redirection and taint
  • File.open(uri)

VFS – Virtual File System

Why VFS?

Typical simple program:

  • Load an external resource
  • Process the resource
  • Store the result

VFS eases the first step.

What is VFS

VFS and polymorphism

The polymorphism can be implemented by :

Polymorphic open

If open-uri is in effect:

  • open(“http://…”) calls URI.open

Any URI can be opened if the URI has an open method.
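The dispatch idea can be sketched without open-uri at all. The code below is not open-uri's implementation, only a toy version of the same duck-typed rule; `vfs_open` and `InMemoryResource` are hypothetical names.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical resource type: anything with an #open method can take part.
class InMemoryResource
  def initialize(content)
    @content = content
  end

  def open
    io = StringIO.new(@content)
    return io unless block_given?
    begin
      yield io
    ensure
      io.close
    end
  end
end

# Hypothetical dispatcher: objects that know how to open themselves are asked to,
# everything else is treated as a file path.
def vfs_open(resource, &block)
  if resource.respond_to?(:open)
    resource.open(&block)
  else
    File.open(resource, &block)
  end
end

puts vfs_open(InMemoryResource.new("from memory")) { |f| f.read }

Tempfile.create("vfs-demo") do |tmp|
  tmp.write("from a file")
  tmp.flush
  puts vfs_open(tmp.path) { |f| f.read }
end
```

The caller uses one method and one block shape for both resources, which is the "fewer things to learn" argument applied to a file system.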

Other Resources

Other Operations

  • URI.read
  • Other polymorphic operations

Security Considerations

  • open(“|…”)
  • File.open is not affected
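The point about `File.open` can be checked safely. `Kernel#open` has historically treated a leading `|` as "run this command", whereas `File.open` treats the whole string as a path. The sketch below only exercises the `File.open` side; `try_file_open` is a hypothetical helper.

```ruby
# File.open never interprets the name, so a leading "|" is just part of a file name.
def try_file_open(name)
  File.open(name) { |f| f.read }
rescue SystemCallError => e
  "not opened: #{e.class}"
end

# No command is run; the call simply fails to find a file with that name.
puts try_file_open("|echo hello")
```

When the name comes from user input, `File.open` (or an explicit `URI.open`) is the safer choice for exactly this reason.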

Summary

  • How to design Easy-to-Use API
    • Save brain power
    • Evolve gradually
  • VFS by open-uri