planet
August 29, 2010
Jason Dagit
I'm not really sure what motivated this, but I just used cloc to count the lines of code in both the darcs source and the git source. Here are the numbers.
The git source tree:
1951 text files.
1836 unique files.
848 files ignored.
http://cloc.sourceforge.net v 1.51 T=15.0 s (72.3 files/s, 20377.3 lines/s)
----------------------------------------------------------------
Language                  files      blank    comment       code
----------------------------------------------------------------
C                           267      15517      13469     100133
Bourne Shell                589      15127       5508      84826
Perl                         40       3798       3441      23825
Tcl/Tk                       39       1453        375       9762
C/C++ Header                 99       1977       3557       8301
make                         12        413        434       2673
Bourne Again Shell            1        144        110       2165
Lisp                          2        231        170       1779
Python                       13        465        442       1384
ASP.Net                       8        141          0        931
m4                            2         87         21        858
CSS                           2        154         24        710
Javascript                    2        113        319        477
Assembly                      1         26        100         98
XSLT                          7         15         29         77
DOS Batch                     1          0          0          1
----------------------------------------------------------------
SUM:                       1085      39661      27999     238000
----------------------------------------------------------------
The darcs source tree:
561 text files.
549 unique files.
57 files ignored.
http://cloc.sourceforge.net v 1.51 T=189.0 s (2.7 files/s, 298.0 lines/s)
----------------------------------------------------------------
Language                  files      blank    comment       code
----------------------------------------------------------------
Haskell                     169       4361       7374      27760
Bourne Shell                300       2071       2869       8333
C                             7        325        153       1494
HTML                          5         41          4        316
C/C++ Header                 12         92         83        308
Bourne Again Shell            3         51         95        180
Perl                          2         43         36        130
CSS                           1         21          3         79
make                          1         12          6         53
Lisp                          1          5          6         23
----------------------------------------------------------------
SUM:                        501       7022      10629      38676
----------------------------------------------------------------
Take those categories with a grain of salt. For example, the darcs source does not have any lisp files. It is interesting that git has 200k more lines than darcs. I'm not sure what that says. C is far more verbose than Haskell? Although, that's not really fair because they also have an order of magnitude more shell code. If you're just comparing C to Haskell it's a factor of about 4.
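For anyone who wants to redo the arithmetic, here is a small sketch that recomputes the comparisons from the "code" column of the two cloc tables. The numbers are copied from the output above; the variable names are only for illustration, and counting C headers together with C is my own assumption about one reasonable way to compare.

```python
# "code" column values copied from the two cloc tables above.
GIT = {"C": 100133, "C/C++ Header": 8301, "Bourne Shell": 84826, "SUM": 238000}
DARCS = {"Haskell": 27760, "Bourne Shell": 8333, "SUM": 38676}

# Difference in total lines of code between the two trees.
total_gap = GIT["SUM"] - DARCS["SUM"]

# How much more shell code git carries compared to darcs.
shell_ratio = GIT["Bourne Shell"] / DARCS["Bourne Shell"]

# C versus Haskell, with and without counting git's C headers
# (including the headers is an assumption, not something cloc decides).
c_ratio = GIT["C"] / DARCS["Haskell"]
c_with_headers_ratio = (GIT["C"] + GIT["C/C++ Header"]) / DARCS["Haskell"]

print(f"git has {total_gap} more lines of code than darcs")
print(f"shell code ratio: {shell_ratio:.1f}x")
print(f"C vs Haskell: {c_ratio:.1f}x ({c_with_headers_ratio:.1f}x with headers)")
```

Depending on whether the headers are counted, the C to Haskell ratio comes out between roughly 3.6 and 3.9, which is where the "about 4" comes from.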
August 23, 2010
Darcs News
darcs weekly news #71
August 23, 2010 07:41 PM UTC
News and discussions
Darcs 2.5 beta 4 was released:
Ganesh put online his branch of darcs containing the ``rebase'' command for all to try. Simon Marlow gave feedback on this feature:
Joachim Breitner uploaded ipatch to Hackage:
Issues resolved in the last week (3)
- issue1875 Eric Kow
- issue1898 Eric Kow
- issue1913 Ganesh Sittampalam
Patches applied in the last week (31)
See darcs wiki entry for details.
August 22, 2010
Joachim Breitner
ipatch on hackage
August 22, 2010 07:05 PM UTC
With the Beta 3 release of Darcs 2.5 on Hackage (the Haskell library and program repository), the ipatch program I recently introduced could now be uploaded to hackage, too. If you use cabal-install, you can now install and use it with a simple run of "cabal install ipatch".
I also made the program now handle patches that add or remove files, extended the help texts a bit and added a test suite. This means that you can actually make use of ipatch as of now, to split patches into several small patches and to apply a patch interactively. Of course it needs some more testing, and you might have feature wishes – in either case, let me know.
August 15, 2010
Darcs News
darcs weekly news #70
August 15, 2010 09:23 PM UTC
News and discussions
Darcs 2.5 beta 3 was released:
The next Darcs Hacking Sprint will take place in Orleans, France, on 15-17 October:
Version 0.1.9 of darcs-benchmark was released, including bugfixes and experimental support for comparison with git and mercurial:
Adolfo Builes blogged about the end of his summer of code project. A summary of his project is available on the wiki:
Issues resolved in the last week (7)
- issue1290 Eric Kow
- issue1530 Eric Kow
- issue1599 Adolfo Builes
- issue1873 Petr Rockai
- issue1896 Ganesh Sittampalam
- issue1908 Petr Rockai
- issue1909 Petr Rockai
Patches applied in the last week (49)
See darcs wiki entry for details.
August 12, 2010
Adolfo Builes
GSoC Week 12
August 12, 2010 03:46 AM UTC
Wow, this is my last GSoC-related entry. I'm publishing early because I won't be around during the weekend.
I mentioned last week that we now had a better mechanism to handle bad cache entries. This week I worked mainly on adding documentation, extending the user manual, and finally sending the patch that adds the environment variable DARCS_CONNECTION_TIMEOUT.
Overall my experience with the people from Darcs was really good. Given that this was my first experience contributing to an open source project, I sometimes had moments when I felt really awkward, but my mentor and the rest of the people on IRC would kindly help me understand the things I didn't know.
With Eric (my mentor) things went pretty well. We would meet weekly and discuss what I had done during the previous week, what I had accomplished, my doubts, and what I would work on next. He always tried to keep a get-things-done culture, and at the beginning, when I wasn't very familiar with Darcs, he would help me get up to speed with some "quiz" questions, which would lead me to an "aha!" moment and to finding the answer by myself.
What's next? I plan to continue contributing to Darcs. For me the most difficult part of an open source project is getting started; I think I am past that, and I want to keep the momentum, so I will try to keep contributing as much as I can.
I would also like to say thanks to the following people on IRC who helped me when I appeared there asking questions: kowey, mornfall, lispy, Heffalump, sm.
Thanks to the Darcs team and I hope to continue having fun and learning lots with you.
All the documentation of my project is in the wiki.
August 09, 2010
Adolfo Builes
GSoC Week 11
August 09, 2010 04:00 AM UTC
In my last entry I mentioned that I had already sent a patch that would allow Darcs to handle bad caches better. I'm happy to announce that my patch is now in the current head :).
Now we have better handling of bad entries in the cache. One of the main benefits of this fix is that some operations that used to take more than 10 minutes now take less than 1 minute; the exact time depends on the number of patches in the repository (see my report for week 8 for a better idea of what was happening).
I also mentioned that I was having some issues trying to use the timeout function from System.Timeout on Windows. I wrote to haskell-cafe and someone reported that it worked for him on Windows 7, while another person reported the same behaviour I was seeing on Windows XP. Then I tried on a friend's computer, where it worked intermittently. So the conclusion is that this is a Windows-related issue and not a GHC one as I thought at the beginning. I would really appreciate it if someone with more experience with Haskell and Windows could point out what is really happening.
For the coming week, I will focus on extending the documentation, sending my implementation of the timeout flag, and completing the "Future work" section of the high-level document.
Alexey Levan
GSoC 2010 Progress Report #3
August 09, 2010 03:30 AM UTC
Last week I was developing a smart server for Darcs. The main challenge in designing the server was that the current code that works with the repository is rather low-level and works on a per-file basis, so making it work with a smart server is a rather nontrivial task.
To solve this problem, I implemented a common interface for working with both smart and dumb protocols, called RepoIO. RepoIO makes it easy to add new protocols to Darcs in the future, and also provides a convenient high-level API for Darcs library users.
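To give a rough idea of what a common interface over dumb and smart protocols buys you, here is a minimal sketch. It is written in Python rather than Haskell, and every name in it (the fetch_many method, the two classes, the pull helper) is hypothetical; it is not the actual RepoIO API, only an illustration of the idea that callers ask for files and the protocol decides how many round trips that costs.

```python
from abc import ABC, abstractmethod


class RepoIO(ABC):
    """Hypothetical stand-in for a common repository access interface."""

    @abstractmethod
    def fetch_many(self, paths):
        """Return a dict mapping each requested path to its contents."""


class DumbRepoIO(RepoIO):
    """File-by-file access: one request per file (e.g. plain HTTP)."""

    def __init__(self, files):
        self.files = files      # in-memory stand-in for the remote repository
        self.requests = 0       # number of simulated round trips

    def fetch_many(self, paths):
        result = {}
        for path in paths:
            self.requests += 1  # each file costs its own round trip
            result[path] = self.files[path]
        return result


class SmartRepoIO(RepoIO):
    """Smart server: the whole batch is answered in a single request."""

    def __init__(self, files):
        self.files = files
        self.requests = 0

    def fetch_many(self, paths):
        self.requests += 1      # one round trip for the whole batch
        return {path: self.files[path] for path in paths}


def pull(repo, wanted):
    """Caller code is identical whichever protocol sits behind `repo`."""
    return repo.fetch_many(wanted)
```

The point of the sketch is only that pull does not need to know which protocol it is talking to; adding a new protocol means adding one more class.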
The disadvantage of this approach is the need to re-implement the download code (for example, at the moment RepoIO does not support HTTP pipelining). However, the current download code needs refactoring anyway, and getting a clean API with a fresh implementation in my opinion justifies rewriting some pieces of code.
At the moment the changes that make the pull command work with RepoIO have already been made, and soon the first results of the smart server's work will be seen. Next week I will finish the implementation of RepoIO for the dumb and smart protocols, as well as the implementation of the server part for CGI and local use (to work via SSH). Together these changes will make a working implementation of a smart server that is able to serve the get and pull commands. Next Sunday I'll make my final report on the completed work.
August 06, 2010
Darcs News
darcs weekly news #69
August 06, 2010 08:58 PM UTC
News and discussions
darcs 2.5 beta 2 was released. Give it a try!
Reinier explained why beta 3 will have to wait a little:
Joachim "nomeata" Breitner released ipatch, a tool based on darcs' hunk editing feature:
Two more blog posts from our Summer of Code students Adolfo and Alexey:
Issues resolved in the last week (2)
- issue1888 Petr Rockai
- issue1892 Petr Rockai
Patches applied in the last week (38)
See darcs wiki entry for details.
August 03, 2010
Joachim Breitner
The problem: Splitting patches
As a Debian maintainer, I often work with patches (files listing changes to text files), for example when tracking the modifications I make to some software before I upload the package to Debian. To manage these patches, quilt is a nice tool: it helps you maintain a stack of patches on top of the original code and encourages you to keep your various modifications separate.
One use case is not supported by quilt at all: splitting patches. One often has a large patch containing several independent changes. This might happen after you fix a few problems in the upstream code and then run dpkg-buildpackage, which will create one patch of your changes and put it in debian/patches. Before, I had to manually edit the patch and write the hunks, which are the building blocks of patches, into separate files.
Where it already works
There is no such problem when using a version control system such as Darcs. Darcs in particular is rightly famous for its user-friendly interface and powerful hunk-selection features. You can even split a single hunk (which could be a change to one line) into two separate steps! Have a look at the HunkEditor page on the Darcs wiki to see how that works.
Let’s steal a feature
Well, it is not stealing if it is Free Software... Darcs has these nice capabilities and provides them in the context of version control systems, while we need them in the context of patch files. But Darcs provides an API to its code, so shouldn’t it be possible to create a program that uses the Darcs code to split patch files? As a matter of fact, it is possible: You can see that program in action in this 3-minute Ogg Theora video, or directly here if your browser supports HTML5:
<video controls="true" height="448" src="http://www.joachim-breitner.de.nyud.net/various/ipatch-demo.ogv" tabindex="0" width="704">Looks like your browser does not support HTML5</video>
Nice, can I use it?
The code is a working proof of concept. What you see works. You do not see how it handles patches that create or delete files, patches that do not apply cleanly or are already applied or any kind of error handling. That does not work yet. If you still want to try it, you can grab the code from the Darcs repository at http://darcs.nomeata.de/ipatch, but you need to build the latest development state of the Darcs library first.
I think ipatch could become a very useful and powerful tool with applications in areas where nobody would think of using Darcs. I definitely want some integration into quilt, automatically replacing the split patch in the series with the resulting patches. Maybe even a git plugin could be created? But I don’t think I can push this project far enough on my own. So this is an invitation to join me and make ipatch a great tool. This invitation goes especially to the Darcs developers: Please have a look at how the code uses the Darcs API and help to improve the collaboration here. I think we can use the darcs-users mailing list until there is need for a dedicated mailing list.
Alexey Levan
GSoC 2010 Progress Report #2
August 03, 2010 06:55 PM UTC
Last week I spent improving and debugging the repository packs code. For those unfamiliar with this feature: repository packs are two tarballs, basic.tar.gz and patches.tar.gz, containing a copy of the repository contents. They are used to get a Darcs repository faster over the network, and will be created by the 'darcs optimize --http' command once it is enabled.
The main changes are:
- while getting a repository via packs, the files are hardlinked to the darcs global cache,
- small tuning of the pack format, to support parallel get using the cache,
- further development and debugging of code for parallel get,
- optimization of packing of inventories.
The last change resolves issue1889, making it possible to get rid of unnecessary files in the basic pack and thereby reduce its size. As a repository is used, the inventories directory can accumulate quite a large number of unneeded files. In the case of the Darcs unstable repository, these files make up a significant part of the basic pack: the unoptimized pack takes 21MB vs 1.7MB with optimization.
Also, I've figured out what caused issue1884, which gave a wrong message reporting success for an incomplete darcs get. It turns out that Darcs.Command.Get has an interrupt handler that covers almost the entire code of darcs get and unconditionally reports success on interrupts. The fix is easy, but it conflicts with the rest of my work that is waiting for review, so it will have to wait a bit too.
By the way, this fix will help to do one more optimization, because it clearly defines the point at which a lazy repository has been obtained. It turns out that in order to get a lazy repository, it's not necessary to download the entire basic tarball: the inventory files at its end may be obtained later, lazily. With this optimization, darcs get of a lazy packed repository will download the same files as the "classical" darcs get, only faster.
While getting the packs can be much faster than getting a repository file by file, it may also be much slower if the repository files have been saved in the cache. However, the cache does not necessarily win either, e.g., it may be on a network share behind a slow connection. Conversely, the "remote" repository can be at arm's length. Or even on the local host. As you can see, things get a little complicated here, and certainly there will be cases when trying to be too clever and guess the way to get the repository (file-by-file using the cache, or using packs) will fail.
The way I solve this problem is actually simple: why choose between two options if you can use both? So I added, at the beginning of the basic pack, a list of the files it contains, in reverse order (the patches pack doesn't need it, as the list can be inferred from the inventory). Now when you get the repository, the pack download starts, and once the list of its files has been received, those files are also fetched in parallel, in reverse order. Since the two download threads work from opposite ends of the list, each eventually discovers that the file it is about to download already exists. At this point their work ends: the pack download is complete.
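The meet-in-the-middle idea can be sketched with a small simulation. This is not the Darcs code: it is a Python sketch with hypothetical names, the two "threads" simply take turns instead of running concurrently, and the speed parameters are an assumption standing in for how fast each source happens to be.

```python
def two_ended_get(pack_order, pack_speed=1, cache_speed=1):
    """Simulate fetching one file list from both ends at once.

    pack_order  -- files in the order the pack delivers them
    pack_speed  -- files the pack download completes per round (>= 1)
    cache_speed -- files the file-by-file fetch completes per round (>= 1)
    Returns (files obtained from the pack, files obtained one by one).
    """
    if pack_speed < 1 or cache_speed < 1:
        raise ValueError("speeds must be at least 1")

    have = set()                        # files already present locally
    by_pack, by_cache = [], []
    front, back = 0, len(pack_order) - 1
    pack_done = cache_done = False

    while not (pack_done and cache_done):
        # Pack download: walks the list from the front.
        for _ in range(pack_speed):
            if pack_done:
                break
            if front >= len(pack_order) or pack_order[front] in have:
                pack_done = True        # next file already exists: stop
            else:
                have.add(pack_order[front])
                by_pack.append(pack_order[front])
                front += 1
        # File-by-file fetch: walks the same list from the back.
        for _ in range(cache_speed):
            if cache_done:
                break
            if back < 0 or pack_order[back] in have:
                cache_done = True       # next file already exists: stop
            else:
                have.add(pack_order[back])
                by_cache.append(pack_order[back])
                back -= 1

    return by_pack, by_cache
```

Whichever source is faster ends up delivering most of the files, and no file is fetched twice, without having to guess in advance which source is the better one.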
The only remaining issue with packs that I know about is the download implementation. The current code for downloading files in Darcs lets you use a file only after the download is complete, which is not suitable for my way of using the packs in parallel with the cache. Since I am going to write custom downloading code in my upcoming smart server work, I think it will be easier to provide a common interface for both smart and dumb (including lazy) downloads, instead of trying to alter the current code, which was not designed for lazy download.
Now that I've finished with most of the repository packs work (well, almost; there will probably be a couple of rounds of review-amend ping-pong on the Darcs bug-tracking system), I'm starting to write the code for the smart server. After designing the server's interface (I'll post the specification on the Darcs wiki), I'll start by writing the client side (it will help to solve the problem with downloading tarballs and put an end to my work on the repo packs sooner). I'll post again about my progress on Saturday, August 7.
July 26, 2010
Adolfo Builes
GSoC Week 9
July 26, 2010 03:14 AM UTC
Last week I mentioned that the implementation to deal with bad entries in the cache was almost ready. During the week I finally sent a first version of the patch, got some review, and am now working to improve it.
One of the things I will change is the way we determine which error occurred when trying to use one of the entries in the cache. In my patch I was using the error string (because at the moment we don't have an ADT for some errors, for example the ones thrown from libcurl), so I will implement a new data type that will help us determine, in a safer way, which error we got.
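As an illustration of why a dedicated data type is safer than inspecting error strings, here is a small sketch. It is in Python rather than Haskell, and the names (FetchFailure, CacheFetchError, marks_entry_bad) are hypothetical, not the type that ended up in Darcs; the set of cases is my own guess at what such a type might distinguish.

```python
from enum import Enum, auto


class FetchFailure(Enum):
    """Hypothetical closed set of reasons a cache entry can fail."""
    TIMEOUT = auto()
    REPO_NOT_FOUND = auto()
    FILE_NOT_FOUND = auto()
    OTHER = auto()


class CacheFetchError(Exception):
    """Error carrying a typed reason instead of only a message string."""

    def __init__(self, failure, detail=""):
        super().__init__(f"{failure.name}: {detail}")
        self.failure = failure
        self.detail = detail


def marks_entry_bad(error):
    """Decide on the typed reason, never on the wording of the message."""
    return error.failure in (FetchFailure.TIMEOUT, FetchFailure.REPO_NOT_FOUND)
```

With a type like this, rewording an error message cannot silently change which cache entries are treated as bad.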
Another thing I will rewrite is the way we determine whether an ssh entry is bad. Again we have the problem of the error type with ssh (which we couldn't infer), so what I was doing was to send a request to the server and check whether it was reachable. But then I realized that this wasn't correct: having an ssh server listening on port 22 doesn't necessarily mean that the host listens on port 80 too.
I also have a first draft of the high-level documentation, which aims to explain how the cache system works. I put out a call for feedback on the darcs-users mailing list, which wasn't very successful; I would really appreciate it if you could take a look and give me some feedback.
During the coming weeks I will finish this patch and work on documentation and testing. More information on my progress can be found in the wiki.
July 24, 2010
Darcs News
darcs weekly news #68
July 24, 2010 10:29 PM UTC
News and discussions
``darcs stash'' was discussed, with different possible implementations and UI proposed, and some example workflows evoked:
Reinier listed the release blockers for darcs 2.5:
Issues resolved in the last week (4)
- issue1716 Reinier Lamers
- issue1883 Eric Kow
- issue1887 Petr Rockai
- issue1893 Ganesh Sittampalam
Patches applied in the last week (42)
See darcs wiki entry for details.
July 21, 2010
the Patch-Tag blog
I started patch-tag coming on two years ago, and a blog post has been brewing about what went right, what went wrong, what I didn’t expect (or what was harder than I thought) and where I see things going. Here goes.
UPDATE: After I posted the initial version of this communique, a couple of people have asked if patch-tag is here to stay. Short answer: yes.
Long answer: see my reply to Eric Kow below.
What went right.
Haskell is a joy to work with, the haskell community continues to be friendly for the newbie and fun for the expert, better now than when I started. The explosion of creativity on hackage has kept making simple things simpler while pushing the limits of what you can experiment with, with a simple library include.
Darcs worked out too. It is still my favorite versioning system and I use patch-tag for offsite versioning of almost all my new work, and keep trying to convert my friends and cubicle mates to it. Darcs users are nice and a lot of them seem to be working on interesting things.
Happstack as a server — overall I’m sold, though there are a lot of rough edges, and I watch snap and yesod with interest. With happstack too many monads and the documentation is sparse… but it got the job done for me, so ok. I really want to do a yesod test drive too — Michael Snoyman has put a lot of himself into this framework, and from browsing the docs and watching the intro videos I have to say he is doing an awesome job.
View Code: HSP/HStringTemplate.
HSP is a bit of an odd one because I don’t use HSP in patch-tag, but I use it for my other happstack work, and I like it. It isn’t all win. If you are using HSP you have to recompile every time you change the view code, and the time lag can get noticeable, particularly if you have monolithic View files with a lot of code, and rely on cabal install. HSP also relies on the trhsx preprocessor, which means that projects that require it won’t cabal install out of the box unless ~/.cabal/bin (or similar) is in your PATH. And there are some mystifying error messages. I think it also uses template haskell, so slightly slower compiles. But even so, I like it. I like having the compiler check that my tags match up, I like being able to copy html from the wild directly into my view code, and I like that designers find it easy to work with.
HStringTemplate is what I use in patch-tag and I like that too. The compiler won’t check that my html tags match up, but it is fast, no recompile is required, and it is pretty smart. I will probably depend on hsp in the future, and might even switch patch-tag to hsp at some point, but overall I have to say hstringtemplate worked out pretty well.
Linode/amazon cloud. I use linode to host patch-tag, and the aws cloud to occasionally spin up a dev server and experiment. Odd combination perhaps but so far so good.
Help from unlikely places. At different times and in different ways Matt Elder, Dan Patterson, and Ram Durbha had a major impact in moving patch-tag forward, for no tangible material reward. I hope to get more detailed in another post, but for now I would just like to say thanks. These guys taught me maybe the most important lesson I learned from patch-tag: that the universe is basically a friendly place.
What went wrong.
The hyphenated name. Mama mia. Why didn’t I just call it patchtag? The actual reason is that I showed logo mockups with and without hyphens to a couple of friends and everybody seemed to like the hyphen version better. If I could go back in time and do-over, I’d probably ditch the hyphen.
The macid data store. I will probably be switching patch-tag back to some more traditional data store, probably a vanilla db. No one big reason, just a lot of small ones.
Well, there is one big one. Patch-Tag app state sometimes just loses data. So, a new user account will get created and then vanish. I think this happens when the out-of-memory killer kills patch-tag before an event has been written to the macid serialization log. Honestly though, I’m not sure. Patch-tag can soldier on with a bit of human intervention because the most valuable artifacts (repositories and namespaces) are on the file system and not in macid. Still not good.
Then there are the minor issues with macid. Small user / dev community. The documentation just isn’t all that great, and not that many people know how to use it. It was started by the happstack originators (alex jacobsen and co I suppose), and found a home in the current happstack cabal; kudos to jeremy shaw and the other regulars on the happstack list. But it just somehow never seems to have “clicked” as a technology from my perspective. It seems to me, somewhat impressionistically I admit, to use an inordinate amount of memory and to slow down compile time (all those template haskell directives). And I don’t really understand it all that well even after using it for a long time. Finally, macid isn’t as easy to roll back on corruption as I had initially thought: you have to edit a binary file and… ugh. Macid *wants* to be a silver bullet kind of nosql solution that will scale to multiple servers transparently, use native data structures, and make life easy for the developer. It is almost that, but it isn’t there yet and other nosql solutions have gotten a lot of traction in the meantime.
Finally… it is somewhat painful to admit, but patch-tag really doesn’t need transactions. I probably could have just used text files and read/show serialization for state. What the heck was I thinking? I can’t even remember anymore.
Cabal/dependency management. I was a bit conflicted on whether to put cabal under the things that went wrong category because I have no intention of giving up cabal and it is clearly a core technology for this kind of project. That said, dependency wrangling proved to be a continuing time and energy drain throughout the project. On a nearly daily basis it seems, things would install one day and wouldn’t the next because somebody had updated a dependency on hackage and this broke something upstream. It doesn’t seem to happen that often in small to medium size haskell codebases, but projects that have a lot of moving parts (eg use happstack) appear more vulnerable to this problem. I suspect yi, gitit, gtk2hs have a similar installer experience.
There is a second thing about cabal that actually has an easy fix, but that I didn’t discover for a long time. (Thanks Matt Brown for pointing this out!) Cabal install, which was how I compiled for the first year or so, started taking a long time. Like, over a minute. The fix is to just compile with ghc --make, after using cabal install just the first time to pull in all dependencies, and periodically to get updates. This seems to usually go about 4 times as fast, and when trying to stay in flow mode 15 seconds versus over a minute can start making a big difference.
Gitit fork. Gitit on the whole, I am happy with. I don’t think that many patch-tag users have taken advantage of this feature, including myself, but I really like the portability of keeping documentation alongside the repo. Cross-pollination with another major happstack consumer is another win. Where I have regrets is having forked the gitit code somewhere along the line to get some look and feel features I wanted because I was in a hurry. Now gitit mainline has progressed several versions and I have to undo the fork I did and clear up all these niggling details to sync up, or maintain the fork forever. Whoops.
Unix security model alongside app security model to get ssh working. This is a mess for portability. If I want to switch servers I need to recreate system accounts, ssh keys, it’s a real drag. Alex Suraci’s Darcsden has an ssh server built in, written in native haskell, and apparently this makes it a lot easier to maintain. Transitioning patch-tag to this library and ditching the unix security details is a high priority for me in terms of future maintainability.
(Eventual) Moneti$ation. I thought I would go after darcs first, add git/svn other systems once darcs was solid, which would be a big enough market to make a sustainable business. Turned out not to be so easy!
On several occasions I was about to do payments, but then I got distracted fixing something without which it wouldn’t have felt right to start charging. Looking back on it, I kind of wish I had already gone after paying customers by this point.
The good news for me on a personal level is that having done patch-tag has really helped me get the kind of work I want to have, and advocate for using the technology I like, namely functional programming.
*********************
End ramble down memory lane, and announcing open source.
Patch-Tag source is open source at
http://patch-tag.com/r/tphyahoo/patch-tag-public
The install is non trivial, so for anybody that wants to check it out without jumping through a whole lot of hoops (highly recommended) there is an amazon ami. Find the ami code by getting your amazon key, setting up ec2 command line tools and executing
thartman@ubuntu:~>ec2-describe-images -a | grep -i patchtag
IMAGE ami-febe5597 072945664613/patchtag dev
ami 072945664613 available public i386 machine aki-5f15f636 ari-d5709dbc ebs
You can run this ami from the command line with ec2-run-instances, or use the aws gui or any of the numerous third party guis. Once you’re in, cd to the patch-tag directory, darcs pull the latest code, cabal install, and you should be good to go.
The main reason I am open sourcing patch-tag is to stimulate the haskell web-devel ecosystem by putting a “real world ready” app out into the open.
If anybody playing with the opened patch-tag wants to help me out with the project, here are my highest immediate priorities.
* Help me choose an open source license for patch-tag. I am considering gnu, bsd3, and CPAL (same license as reddit). What does the community think?
* Get patch-tag easier to install and on hackage. Patch-tag is not on hackage, because my feeling is that a program on hackage should Just Work or it violates legitimate expectations, and patch-tag cabal install is far from just working. Anybody is free to throw patch-tag on hackage under their own user account but I would say better to wait because the install issues *are* fixable with a little love and care.
Key subtodos for this are
* Unfork gitit and peg patch-tag to mainline gitit
* Use same ssh lib as darcsden so I can ditch most or all of the linux sysadmin stuff
* Make patch-tag machine instance for platforms other than ec2 (eg virtualbox, vmware, make it runnable on windows, etc).
* Diagnose and fix a suspected memory leak that results in regular visits from the dread oom killer.
* Transition patch-tag to a better understood / supported persistence solution than macid. My current thinking is sqlite, or possibly just text files with read/show serialization.
* Work on macid, and help realize the macid dream of straightforward nosql style persistence that uses native data structures for an easy start and scales to zillions of users with no unpleasant surprises.
Finally, and most importantly: Use haskell web-devel. Write documentation, ask questions, get on irc, and make the world a friendly place for people that like haskell and want to write the next facebook/ita/viaweb killer.
Thanks for tuning in, folks, and happy tagging!
July 17, 2010
Adolfo Builes
During this week I have been working on the implementation to deal with bad entries in the cache. For a bit of background on how the cache works, read my previous post.
Among remote entries, we consider problematic those that 1) are not accessible (they time out each time we try to access them) or 2) are accessible but point to a repository that doesn't exist.
We consider a local entry bad if it points to a repository that doesn't exist.
At the beginning the idea was to automatically remove those entries from the cache, but as I mentioned in the previous post, sometimes there are external factors that prevent me from reaching an entry. So the approach taken to solve this problem was simply to stop using those entries for the rest of the command, and at the end to notify the user about them and allow her to delete the bad entries interactively, something like:
$ There seems to be a problem with the following repos: repo1, repo2, repo3
Would you like to delete them from the cache ? Y/N
I haven't implemented the part that notifies the user about the bad entries, but I introduced the changes to stop using an entry once we have a problem with it. Remote repositories that time out are added immediately to the list of bad caches, and we don't try to fetch patches from them again. If a remote entry doesn't time out but throws an error, it could be because a) the requested file wasn't there or b) the repository doesn't exist. So when I get an error of this type, I check what caused it: if the repository doesn't exist, the entry is added to the list of bad caches; if not, it is added to a list of "good entries", so that if we later get another error from it because a file doesn't exist, we don't need to check any of the conditions mentioned before.
A similar approach is taken for local entries that fail.
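The bookkeeping described above can be sketched roughly as follows. This is a Python sketch and not the Darcs implementation: the class, the callbacks fetch_file and repo_exists, and the use of Python's TimeoutError and FileNotFoundError to stand for the two kinds of failure are all assumptions made for the illustration.

```python
class CacheSession:
    """Tracks bad and known-good cache entries for one command run."""

    def __init__(self, repo_exists):
        self.repo_exists = repo_exists  # hypothetical probe: entry -> bool
        self.bad = set()                # entries we stop using for this run
        self.good = set()               # entries whose repository is known to exist

    def fetch(self, entries, fetch_file, name):
        """Try each usable entry in turn; return the contents or None."""
        for entry in entries:
            if entry in self.bad:
                continue                # never retry an entry already marked bad
            try:
                return fetch_file(entry, name)
            except TimeoutError:
                # Unreachable entry: mark it bad immediately.
                self.bad.add(entry)
            except FileNotFoundError:
                # Either the file or the whole repository is missing.
                # Probe once; afterwards the verdict is remembered.
                if entry not in self.good:
                    if self.repo_exists(entry):
                        self.good.add(entry)
                    else:
                        self.bad.add(entry)
        return None
```

The saving comes from the first `continue`: an entry that timed out once is skipped for every later patch instead of being waited on again.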
After introducing the changes, I did a lazy get of Tahoe-LAFS, then added a bogus entry for the timeout case and compared the time taken by the command darcs changes with my version and with the version on Hackage.
For 2.4.4 (release) which is the version in hackage, it took a bit more than 13 minutes.
real 13m38.415s
user 0m4.148s
sys 0m1.060s
Then, after I rebuilt with my changes, the results were really good, taking less than 1 minute to fetch the changes:
real 0m55.679s
user 0m1.092s
sys 0m0.240s
Before my changes, Darcs would try to establish a connection with each of the entries in the cache for each patch (waiting for the timeout). Now, if it finds a bad entry, it doesn't try to use it for the rest of the command, which makes operations faster since we don't waste time trying to use bad resources.
I also implemented an environment variable, "DARCS_CONNECTION_TIMEOUT", which sets the waiting time for a request.
To implement this when we are using libcurl, I set the CURLOPT_TIMEOUT option; when we are using Network.HTTP instead, I wrapped the simpleHTTP operation in the timeout function from System.Timeout. On Linux it works perfectly with both libcurl and simpleHTTP, but on Windows I have a problem with the 'timeout' function from System.Timeout: it doesn't behave as expected and seems to get ignored.
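To make the idea concrete, here is a rough Python sketch of reading the variable and wrapping a request in a timeout. It is not the Darcs implementation: the default value, the assumption that the variable is given in seconds, and the helper names `connection_timeout` and `with_timeout` are all made up for illustration. Like `timeout` from System.Timeout, `with_timeout` returns "nothing" (here `None`) when the action takes too long.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

DEFAULT_TIMEOUT_SECONDS = 30.0   # hypothetical default, not taken from Darcs


def connection_timeout(env=os.environ) -> float:
    """Read DARCS_CONNECTION_TIMEOUT (assumed here to be in seconds)."""
    raw = env.get("DARCS_CONNECTION_TIMEOUT", "")
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_TIMEOUT_SECONDS       # unset or not a number
    return value if value > 0 else DEFAULT_TIMEOUT_SECONDS


def with_timeout(seconds, action):
    """Run action(); return its result, or None if it takes too long."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(action)
    try:
        return future.result(timeout=seconds)
    except FutureTimeout:
        return None
    finally:
        pool.shutdown(wait=False)            # do not block on the abandoned worker
```

Note that this sketch only stops waiting; the abandoned worker keeps running in the background, so it should not be read as a model of how libcurl or System.Timeout cancel a request.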
Next week I will focus on finishing this, extending the haddock documentation of the cache, and tests. Remember you can always check my progress in the wiki.
July 11, 2010
Darcs News
darcs weekly news #67
July 11, 2010 05:13 PM UTC
News and discussions
Reinier announced the first beta of Darcs 2.5:
The next darcs sprint will happen in October in Orleans, France:
Issues resolved in the last week (6)
- issue1288 Ganesh Sittampalam
- issue1726 David Markvica
- issue1825 Petr Rockai
- issue1845 Petr Rockai
- issue1865 Petr Rockai
- issue1871 Petr Rockai
Patches applied in the last week (83)
See darcs wiki entry for details.
June 28, 2010
Darcs News
darcs weekly news #66
June 28, 2010 12:33 PM UTC
News and discussions
Reinier announced the release schedule of Darcs 2.5 (soft freeze July 8th, release August 7th):
Reinier also issued a call for volunteers for fixing unassigned bugs that should be fixed for the next release:
Eric explained how to make the patch reviewing process more efficient:
And the Summer of Code blog posts of the last two weeks:
Issues resolved in the last week (7)
- issue1176 Adolfo Builes
- issue1277 Eric Kow
- issue1389 Reinier Lamers
- issue1713 Eric Kow
- issue1857 Petr Rockai
- issue1877 Florent Becker
- issue1879 Eric Kow
Patches applied in the last week (53)
See darcs wiki entry for details.
June 26, 2010
Adolfo Builes
GSoC Report: Week 5
June 26, 2010 08:41 PM UTC
In my last post I said I had completed the warm-up phase and that I would start to think about how to handle caches which are no longer available.
The caching mechanism relies on the files _darcs/prefs/sources and ~/.darcs/sources. The content of those files is used to generate the cache entries, and each entry indicates an alternative source to get files from. If we want to specify global caches we put them in ~/.darcs/sources, but if we want alternative repositories to pull from, we specify them in the repository's sources file, _darcs/prefs/sources. Also, each time we pull from an external repository it is added to the sources file automatically.
The problem of expiring caches arises because repositories that were once available can become unavailable. For example, if I had pulled from 3 different repositories and 2 of them stopped being available, it could take up to 2 minutes to get each patch: Darcs may try to fetch every patch it needs from those 2 unavailable repositories, so it tries to establish a connection and then waits for a time-out or a bad response code. Given this problem, the idea is to design a mechanism which helps Darcs decide which entries should be expired.
We can split the cache entries into two groups: local and remote. Dealing with local repositories that are no longer reachable is not a big deal: if we don't find a local entry we can assume it doesn't exist, drop it from the cache and stop trying to fetch files from it. Remote repositories are trickier. For example, I can't eliminate an entry just because it times out when we try to establish a connection with it, since other external factors (firewalls) could interfere with that particular entry at a given moment.
As we can see, this is not an easy task. Handling remote repositories is out of our hands: we don't have control over the external sources, the network configuration and so on. So a first approach is to mark the entries which are not working and ignore them for the rest of the pull, since we don't want to try to establish a connection with an entry which we know is not available. If we try to establish a connection and it fails we can mark the entry as bad, but I think it would be awkward to wait for a 60-second time-out. Something we could implement is a default time to wait for a connection to succeed (10-15 seconds maybe); if it doesn't succeed within that time we can skip the entry, mark it as bad, and not try it for the rest of the patches. Another approach suggested in the bug tracker was to try to establish a connection with all of the entries and use the one that responds first, but then what if all the entries are bad? I have to think more about it. I will discuss it on IRC and the mailing list, and then with a clearer idea I will start to code a patch to solve the issue.
I also sent a first version of a failing test for the case of unreachable entries, but I need to amend it, as there are some missing cases.
More of my progress can be found in the wiki.
Alexey Levan
This summer I am working on making Darcs faster over networks. My project consists of two parts:
- Implement an optimization for getting a repository over HTTP. This is done by creating a snapshot of the current repository state.
- Create a smart server for darcs, which is able to determine the patches needed by the client and send them in its response. This will be the most effective way to get/pull a repository, since it reduces the number of roundtrips to a minimum.
The original complete description of the project can be seen on the wiki. Note, however, that for the smart server more priority is given to the CGI frontend than to plain HTTP.
Changes so far
I am glad to report that most of the work on the HTTP optimization is complete and the patches are on their way to the Darcs repository. Some notes on the implementation:
- getting an optimized repository results in almost the same copy as getting an unoptimized one. While inventory files may be split in different ways, the resulting repositories remain semantically identical.
- there are still a couple of issues with special cases, like working with the cache and handling interrupts; I will hopefully resolve them soon.
Next week
I have now started implementing the smart server. Besides completing the work on the optimize --http issues, next week I will refactor the code of the get/pull commands, which will result in a cleaner API for the server.
June 20, 2010
Adolfo Builes
GSoC Report: Week 4
June 20, 2010 08:02 PM UTC
I have completed my phase of warm-up issues, which was meant to let me get familiar with the Darcs system, specifically the cache part. I have sent the patches and they have been applied.
During the week I worked on finishing issue 1176 and continued with the documentation, describing at a higher level how a patch is fetched given its hash, to make it easier for someone with a non-technical background to understand what is happening. I also fixed a test which was failing on Windows when it shouldn't, and found out why the IO operations over hashedRepos were put in a different module.
To fix issue 1176 I modified some code I had already sent, which kept the caches sorted by locality. The initial idea was that anything local should come first in the list, followed by the remote sources. But we realized that there is also a "wanted" hierarchy among the remotes: we would prefer to access http repos before ssh ones. So the new sorting keeps all the locals first, http in the middle and ssh repos at the end. One of the problems this solves is the weird behaviour of darcs trying to establish an ssh connection when pulling from an http or local repository.
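As an illustration of that ordering, here is a small Python sketch. It is not the code that went into Darcs: the function names are hypothetical, and the rule used to recognise an ssh location (a colon before the first slash, as in `user@host:path`) is a simplification that would misclassify Windows drive paths such as `C:/repos`.

```python
def source_rank(location: str) -> int:
    """0 for local paths, 1 for http(s) URLs, 2 for ssh-style locations."""
    if location.startswith(("http://", "https://")):
        return 1
    head = location.split("/", 1)[0]         # everything before the first slash
    if location.startswith("ssh://") or ":" in head:
        return 2
    return 0


def sort_sources(locations):
    """Locals first, then http, then ssh; sorted() is stable, so the
    original order is kept inside each group."""
    return sorted(locations, key=source_rank)
```

With this ordering, an ssh source is only consulted after every local and http source has been tried.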
While I was working on the failing test we redefined what gets saved in _darcs/prefs/sources. The global caches were one of those things: it wasn't necessary to have them there, because that is what the global cache configuration file (~/.darcs/sources) exists for. So we now just drop anything related to global caches before saving the sources file of a repository.
Next week I will start to work on the problem of how to deal with the unused cache. The idea is to have a work plan and a test case written by the end of the week.
You can check out more of my progress in the wiki.
June 16, 2010
the Patch-Tag blog
patch-tag gets more ram
June 16, 2010 08:38 PM UTC
June 12, 2010
Darcs News
darcs weekly news #65
June 12, 2010 01:20 PM UTC
News and discussions
Lele Gaifax released a new version of the trac+darcs plugin:
We are still looking for a release manager for the release of darcs 2.5. Eric summarized the discussions concerning recruitment and the responsibilities and challenges related to this job:
Roadmap: rebase won't be in 2.5, annotate will be improved:
Issues resolved in the last week (2)
- issue1210 Adolfo Builes
- issue1874 Eric Kow
Patches applied in the last week (21)
See darcs wiki entry for details.
June 10, 2010
Jason Dagit
Last time I presented the idea that version control and delimited continuations are related. I left off with the question of how to make Darcs fit this model. I think I understand now what I was missing.
I forgot to think about Darcs operations in terms of the intermediate operations that get performed. In Darcs, everything is based on commuting patches, even merging. Therefore, to see how Darcs fits into this model it's important to think about commuting patches in terms of delimited continuations.
Specifically, I now believe that commuting two patches introduces marks that can be shifted to later.
I have several ideas for the next steps of this. One is to start modeling toy versions of svn and darcs in Haskell via delimited continuations. After that, I would like to figure out the correspondence between the delimited continuations that I've identified and their data structure reification as zippers. I hope to have more about that later.
Judging by a paper written by Oleg, there should be a natural way to convert the delimited continuation representation into a zipper. Investigating this model might shed light on the Darcs patch model, or even lead to a more concise formalism.
June 06, 2010
Darcs News
darcs weekly news #64
June 06, 2010 08:45 PM UTC
News and discussions
An alpha release of darcs 2.5 might happen soon:
Summer of Code: Alexey Levan sent a first version of patches for darcs optimize --http, and Adolfo Builes sent a patch fixing a bug concerning cache pool choice by darcs:
Issues resolved in the last week (10)
- issue1337 Petr Rockai
- issue1503 Adolfo Builes
- issue1610 Petr Rockai
- issue1817 Petr Rockai
- issue1839 Florent Becker
- issue1843 Florent Becker
- issue1848 Florent Becker
- issue1860 Petr Rockai
- issue1861 Eric Kow
- issue1864 Florent Becker
Patches applied in the last week (18)
See darcs wiki entry for details.
Adolfo Builes
GSoC Week 2
June 06, 2010 07:05 PM UTC
After last week's meeting with Eric (my mentor) we got a better timeline written down, and now I have clear goals for each week until the midterm evaluation. For this week my goals were:
* Complete issue1503
* Complete issue1210
* Description of cache usage for each module in Darcs.Repository
I'm happy to say that issue 1503 was closed, and I sent the patch for issue1210, which is waiting for review. I also noticed that I'm more familiar with the darcs structure (at least the part in which I'm working), and I've started to know where to look each time I need a particular piece of functionality.
When I first sent the patch for issue 1503, Eric and Petr made some comments about design and style which led me to rewrite the original patch. Thanks to them I have learnt to push myself to think about the bigger picture each time I plan to introduce changes somewhere. Sometimes you develop bad practices which you don't see unless someone else points them out to you. So I feel more convinced that a good way to learn and become a better programmer is contributing to this kind of project, where you interact with other people who comment on your code and make you think harder about what you are doing. This is something you won't learn at university, but only by getting involved in something in the real world.
Another thing I learnt was that I had the wrong idea about tests. I thought the test case for an issue would be one that used to fail before applying the changes, but I hadn't thought about how the test should behave when it fails. Thanks again to Eric for explaining it to me :).
Next week I plan to work on issue1176, start to elaborate a test plan for issue 1599, write the test cases for issues 1503 and 1210, and continue my work on the darcs cache document.
You can always check my progress and learn more about my project in the darcs wiki, where we are documenting everything.
Jason Dagit
Delimited continuations give us a way to create markers that we can jump back to. We can construct the future of the computation, work with the computation so far, or abort the current continuation and go down a new path (create a new future).
These primitives have a natural correspondence with version control systems that snapshot the world like SVN. Focusing on SVN for a moment:
- The current continuation is the transformation of repository state that you're working on. It's the diff you're creating.
- For commits, each revision created by commit is a marker, so we model this with reset.
- Checking out an older revision, or reverting changes, corresponds to a shift. We discard the current continuation and move back to a marker created by a specific reset.
- Updating consists of having the client copy learn the current state of the continuation on the server and applying it to the local copy.
- Starting a branch corresponds to a reset.
- Merging two branches is a bit trickier I suspect. I haven't worked out all the details sufficiently to convince myself I have it right, but here is how I think this case works. The merge first shifts to each of the markers and then combines those two continuations into one future. The part that seems weird to me about this is that I haven't really seen examples of delimited continuations where the continuations of two different markers (prompts) were combined.
Now, if you accept the above it gives us some intuition to build on. Although my correspondence is terribly informal at the moment, if we took some time to make it formal by working out enough details to model it in, say, Haskell, then we'd have a nice formal backing for how SVN works. I think the above model applies equally well to git, but I'm not confident with git's model.
One insight the above gives, is that the way merging works is not described by the continuations in general. It's up to the exact combining function to determine the merge. We know that an automatic merge can fail in practice due to things like conflicts between the changes in the branches. So, in the SVN implementation considerable work has gone into implementing logic for creating the proper continuation.
Now, in general when a merge, or update, is performed human intervention may be required. This typically happens when the changes are in conflict. What this means for our model is that in general the creation of the continuation requires knowledge outside of our model. What does that correspond to? Well, it's essentially saying that calculating the correct continuation to resolve the merge is non-deterministic!
In other words, this continuation view of version control gives us a rigorous way to talk about our intuition. Of course we can easily tell that merging is going to require human intervention without needing to study delimited continuations, but this framework of reasoning now gives us a more mathematical way to say it.
The next question is: How does the delimited continuation model of vcs apply to Darcs?
So far I'm not sure. I suspect, without working through the details, that in Darcs, every time you record a patch multiple resets happen, instead of just one. The model really breaks down for Darcs because you can seemingly visit points in "history" that did not exist when the patches were created, but they are valid repository states.
For example, imagine a repository that only exists locally. You create a sequence of patches. Now, take a patch in the middle that can be commuted to the end of the patch sequence. Doing so has not created a new repository state; so far this is fine with the above model as no new markers need to be created. Suppose we remove the patch from the end of the patch sequence. This is exactly how darcs unpull works. The funny thing is, we've now created a state that never existed previously. So in the delimited continuation model, what marker did we just shift to?
I don't know the answer yet, but I think it's an interesting question. I suspect there are multiple "correct" answers, but that only some answers will yield elegant and robust models here.
Thanks to Eric Kow, Duncan Coutts and Ian Lynagh we have some great timing data for using darcs2 and darcs1 to push patches over ssh.
Eric wrote a script to test three different scenarios of using darcs to push patches:
- Scenario l1r1:
This is a local darcs1 client talking to a remote darcs1 executable.
- Scenario l1r2:
This is a local darcs1 client talking to a remote darcs2 executable.
- Scenario l2r2:
This is a local darcs2 client talking to a remote darcs2 executable.
Next, Duncan and Ian provided us with access to 131 real-world repositories hosted at http://code.haskell.org. We ran the script to push patches to each repository, which gave us a ton of timings. Then we crunched these numbers in Excel and saw that not only is scenario l2r2 no worse than the other two, it’s actually faster on the time-consuming cases!
The one caveat we found is that the minimum start-up time for the first two scenarios is 1 second and in the last scenario it’s 2 seconds. I’m confident we can shave off this 1 second difference in the future.
This is a histogram that shows you how the push times distribute, click on it for a large image. Along the bottom we have how many seconds the push took, and along the vertical axis we have the number of data points in that range. At a glance you can see that most repositories take just a few seconds to push. We can also see that darcs2 is slower on small pushes by about one second. Darcs2 in this chart corresponds to l2r2 and darcs1 corresponds to l1r1.
On a side note, we also tested converting all the repositories to darcs2 repository format and that worked great as well. Converting all the repositories at once takes less than 20 minutes on my laptop without a single error. There were a few warnings, but that’s to be expected as potentially exponential merges are fixed in the new darcs2 format, but darcs emits a warning when fixing them.
For anyone that wants to see the raw numbers click here. The link does work, but not all web browsers are showing the numbers. Opera and FF3 work on some platforms and not others.
On March 20th, 2009, I successfully defended my Master’s thesis in Computer Science.
Abstract:
Ensuring correctness of real-world software applications is a challenging task. Testing can be used to find many bugs, but is typically not sufficient for proving correctness or even eliminating entire classes of bugs. However, formal proof and verification techniques tend to be very heavy weight and are simply not available for day to day use in many common programming environments.
We demonstrate a form of light-weight proof assistant by using the type checking features of the programming language Haskell with existing extensions. We apply this work to the Open Source version control system Darcs. The properties checked by our approach are derived directly from the data model used by Darcs. This allows us to eliminate entire classes of bugs at compile time. We also examine how these techniques improve the quality of the Darcs codebase and the challenges that arise when applying these techniques in practice.
You can read the full thesis here. The slides from my presentation are located here.
The bottom line is that we used Generalized Algebraic Data Types (GADTs) to enforce proper patch manipulations. In the Darcs implementation, patches are stored in sequences and rearranging those sequences can only be done in very specific ways. Our use of GADTs allowed us to express those constraints using existentially quantified types, phantom types, and witness types. If you’ve ever wondered how to use GADTs in real-world software, this serves as a very illustrative example.
Understanding Darcs Commute
June 06, 2010 08:14 AM UTC
People often want to understand how commute on patches works. Usually we start by saying:
Given two patches, A and B, if A and B commute then: AB <--> B’ A’, for some B’ and A’.
Naturally people ask, “But what is the relationship between A and A’ or B and B’?” This is a very important question and I’ll provide you with some insight.
Suppose we have a repository with 2 files, a and b. We could then make the following operations:
You can think of each operation as a transformation on the ’state’ of your repository.
Suppose also, that we make an edit to a, and an edit to b.
Let’s name the above, using T for transformation:
- T_bc = mv b c
- T_ab = mv a b
- T_ca = mv c a
- T_a = edit a
- T_b = edit b
You can imagine that if I gave the diff for T_a and the diff for T_b that you could apply those diffs in either order to your repository and get the same final ’state’. Meaning, a and b are the same whether you update a first or b first.
But, suppose instead that I performed T_bc, T_ab, and then T_ca. This has the effect of swapping a and b by name. Now suppose you applied the diffs T_a and T_b. What would you want the outcome to be?
It turns out, that it matters which operations were created first. If you created the diffs T_a and T_b *before* you did the operations of the swap, then you should expect that after the swap the diff for T_a actually modifies b, whereas T_b should modify a. On the other hand, if you created the diffs T_a and T_b *after* the swap, then you expect T_a to modify a and T_b to modify b.
We have an intuitive idea of ‘context’ now. As in, what is the context that T_a and T_b were created in? Knowing this will tell us how they transform the repository state.
Intuitively, it seems as though we need to remember the ‘context’ in which T_a and T_b were created. So let’s say that the operations performed up to the point where T_a is created is the context of T_a. In other words, the context for T_a is sequence of transformations that existed when T_a was created. Similarly, since T_a is a transformation, creating it results in a new context, which is the old context plus T_a. We could say that T_b has this context. Going a bit further, it seems like we should talk about how T_a has a pre-context and it also has a post-context.
For example, if we created T_a before doing the swap, then the pre-context might include two transformations, one that creates a and another one that creates b. The post-context would then include those two transformations and T_a itself. If we created T_a after doing the swap, the pre-context and post-contexts of T_a would include T_bc, T_ab and T_ca also.
Now a side note about commutative functions. Consider the function created by composing T_a and T_b, let’s write T_a . T_b. Recall, that with function composition parameters start on the right and pass through the sequence to the left. As discussed in the intro, T_a . T_b is equal to T_b . T_a. This is because T_a and T_b are independent of each other. Thus, we would say that the functions T_a and T_b are commutative functions. This means, that changing their order of application does not change the result.
We are saying that:
T_a . T_b = T_b . T_a
Because T_a and T_b are commutative it doesn’t matter which order we compose them. If we restrict our view to just the repository above with only the files a, b and no c, then on this restricted set of repository state how do these two compare?
- T_b . T_a
- T_a . T_b . (T_ca . T_ab . T_bc)
In plain English, the first one edits a and then b, the second one swaps a and b, edits b and finally edits a.
As far as the mathematics of it is concerned, the first one will edit a and b, while the second one will have T_a editing a different a than the first one and T_b editing a different b than the first one.
Going a bit further, let’s say that T_a and T_b were created without any of T_bc, T_ab or T_ca in their context. So we could have two scenarios.
We could, for example, start with T_b and T_a, swap their order and then do the swap of a and b afterwards. That would give us:
T_b . T_a
and
(T_ca . T_ab . T_bc) . T_a . T_b
Intuitively, it seems like T_a and T_bc are commutative functions, eg., T_bc . T_a = T_a . T_bc. So we could rewrite the second one as this:
T_ca . T_ab . T_a . T_bc . T_b
Now, suppose when we commute the function T_a with T_ab, that we replace T_a with T_a’. T_a’ is like T_a except that T_a’ makes the edits of T_a to b instead of a. After all, this results in T_a’ editing the correct file after the rename. Similarly, when we commute T_b with T_bc, T_b is replaced with T_b’ that edits c instead of b. When we commute T_b’ with T_ca we replace T_b’ with T_b” that edits a instead of c.
So, the above goes through these steps:
- T_ca . T_a’ . T_ab . T_bc . T_b (commute T_a to the left)
- T_a’ . T_ca . T_ab . T_bc . T_b (commute T_a’ to the left)
- T_a’ . T_ca . T_ab . T_b’ . T_bc (commute T_b to the left)
- T_a’ . T_ca . T_b’ . T_ab . T_bc (commute T_b’ to the left)
- T_a’ . T_b” . T_ca . T_ab . T_bc
The last one will then have T_a’ and T_b” making edits to the same file contents as T_a and T_b respectively, even though the names of the files were changed by the swap.
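The steps above can be replayed with a toy model. The Python sketch below is only an illustration under simplifying assumptions, not how Darcs represents patches: a repository state is a dictionary from file name to contents, an edit just appends text, and the names `apply_op`, `run`, `commute` and `move_later` are made up. Lists are applied left to right, so they read in the opposite direction to the composition notation used in this post.

```python
def apply_op(state, op):
    """Return a new state (file name -> contents) after one operation."""
    new = dict(state)
    if op[0] == "mv":
        _, src, dst = op
        new[dst] = new.pop(src)
    else:                                    # ("edit", name, text): append text
        _, name, text = op
        new[name] = new[name] + text
    return new


def run(state, ops):
    """Apply ops in list order (left to right)."""
    for op in ops:
        state = apply_op(state, op)
    return state


def commute(first, second):
    """Swap two adjacent ops; return (second', first') or None on failure."""
    kinds = (first[0], second[0])
    if kinds == ("edit", "edit"):
        return (second, first) if first[1] != second[1] else None
    if kinds == ("edit", "mv"):
        _, name, text = first
        _, src, dst = second
        if name == dst:
            return None
        # After the rename, the edit must target the new name.
        return second, ("edit", dst if name == src else name, text)
    if kinds == ("mv", "edit"):
        _, src, dst = first
        _, name, text = second
        if name == src:
            return None
        # Before the rename, the edit must target the old name.
        return ("edit", src if name == dst else name, text), first
    return None                              # mv/mv is left out of this sketch


def move_later(ops, start, stop):
    """Commute ops[start] to the right until it sits at index stop."""
    ops = list(ops)
    for i in range(start, stop):
        swapped = commute(ops[i], ops[i + 1])
        if swapped is None:
            raise ValueError("patches do not commute")
        ops[i], ops[i + 1] = swapped
    return ops


T_a, T_b = ("edit", "a", "+ea"), ("edit", "b", "+eb")
swap = [("mv", "b", "c"), ("mv", "a", "b"), ("mv", "c", "a")]   # T_bc, T_ab, T_ca

initial = {"a": "A", "b": "B"}
before = [T_b, T_a] + swap                   # edits first, then the swap
step = move_later(before, 1, 4)              # T_a becomes T_a' (edits b)
after = move_later(step, 0, 3)               # T_b becomes T_b'' (edits a)

assert run(initial, before) == run(initial, after)
```

In `after`, the swap comes first and the two edits come last with their file names exchanged, yet both sequences produce the same final state, which is the property the commutes are meant to preserve.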
So, if you’ve followed me to this point, then you now have the intuition for what we mean when two patches A and B, commute to B’ and A’, as AB <--> B’ A’. You can think of a patch as being one of the above transformations along with the context of the transformation. You might also notice that commute of patches must be doing something to the context of the patches.
Patch commute has the potential to update the context and transformation of the patches it swaps OR it could update the context and leave the state transformations equal to what they were in the input. Patch commute can also fail, but we’re ignoring that case for the moment.
Thinking back to how we arrived at the need for context, you might notice that for each context, that is each sequence of operations, we get one unique repository state. This is a very important property of context. Without it, context wouldn’t really be useful. Also, notice that the opposite is not true, repository state does not determine the context. Which makes sense, because there are lots of operations you can do that get the repository to a particular state, so given a state how do you know what was done?
The next important property we want for commuting patches is that once two patches have been commuted, you can commute them again to undo the commutation. In fact, it turns out the examples above are saying we want contexts to determine the same state if you commute the patches inside the context (again, context is a sequence of patches!).
For R to be an equivalence relation, we need three things:
1. x R x is true for all x
2. if x R y then y R x
3. if x R y and y R z then x R z
Here, we replace x R y with “the sequencing, or order, of x can be obtained by commuting adjacent elements of y”. Roughly how to prove each:
1. either claim that zero commutes satisfies the definition of R, or check that commute is self-inverting
2. relies on the self-inverting nature of commute, I think
3. messier but should still be provable
I’m pretty sure both (2) and (3) could be done with a brute force proof that considered all the pairings of patch types in their general cases. Start with all sequences of length 2, then 3 and I think at that point you could make an inductive argument to hit sequences of length n. This would be a lot of work, and I’m not convinced it could be fully automated.
Why would we want to show the above? Showing that R is an equivalence relation would tell us that sequences of patches are equivalent under commute. Now, combine this with the idea that context determines the state uniquely and now we know sets of patches uniquely determine your repository.
The weekend of 24-25 October 2008 was an International darcs hacking sprint! The sprint was a lot of fun and we’ll be having more. The sprint provides a very productive atmosphere for hacking.
We had a team in Brighton with posts from Day 1 and Day 2. We also had a team in Paris but I don’t have a link for them.
Here are just some of the highlights from the Portland Team:
- Adding language pragmas in all files:
- makes the code cleaner when it’s time to drop ghc6.6 support
- all required language extensions are now known
- makes it easier to check for Haskell’ compatibility
- Removed OldFastPackedStrings
- Replaced FastPackedStrings api in favor of Data.ByteString api
- lots of small optimizations, less pack/unpack, more standard ByteString code
- removed a fair bit of C code, new code compiles to same or faster assembly (Don checked)
- cabalization:
- no autoconf or make needed
- cabal install tested and working on linux / osx, windows testing soon to follow
- builds out of the box w/ 6.8 and 6.10
- configure is much faster
- module graph (depends on cabalization)
- Duncan improved zlib
- soon to be available on hackage
- allows us to replace our own implementation of zlib bindings with the mainstream one
- will make building on windows easier
- can use lazy bytestrings
Here are some random pictures from the Portland Sprint.
Packages we should consider:

The TODO list we made on the first day:

Checking on Team Brighton:

Duncan and Jason looking at the projector:

Ah, beautiful Portland in fall:
May 30, 2010
Adolfo Builes
It has been almost a week since hacking started. During the week my activities were focused on getting to know the Darcs code, working on the skeleton of a high-level document for the cache system, and looking at one of the warm-up issues. I won't say I already know how it all works, but I'm picking up new things each day and that's fine :).
On the Darcs code side, I know most of my work will be done in the module Darcs.Repository, where the cache code lives. I still need to dig deeper and see why, how, and for what it is used in each of the submodules, and how the cache system fits into the whole system.
The first sketch of the skeleton for the high-level document can be found in [1]; we are putting all the administrative stuff in [4] to help us keep track of everything.
About one of the first warm-up issues (1503): I found a possible way to fix it, and you can read more about it in the last comment in [2]. I haven't sent a patch yet, but I hope to have it before Tuesday.
While I was looking at this issue we noticed something weird in the log and found another issue [3]. I still have a test left to do, to check whether the problem is specific to the installation on my server or comes from the current branch.
For the coming week I plan to:
- Close issue 1503.
- Pick up another issue.
- Elaborate more on the "Internal" section of the document.
[1]- http://wiki.darcs.net/DarcsInternals/CacheSystem
[2]- http://bugs.darcs.net/issue1503
[3]- http://bugs.darcs.net/issue1854
[4]- http://wiki.darcs.net/GoogleSummerOfCode/2010-Cache