planet

July 24, 2014

Ale Gadea

Some Week (14-19 July)

July 24, 2014 04:43 AM UTC

Hi all!

I was finishing understanding and implementing the command darcs send --minimize-context using "optimize reorder" when I began to suspect that it doesn't solve the problem described here. The thing is, even though the context in the bundle to send is reduced if we run "optimize reorder" before sending, this doesn't solve the problem of dependencies. Guillaume finished clearing up my doubts, and so after reading:

[darcs-users] darcs cannot apply some patch bundles
irclog
issue1514 (the issue which "replaces" issue2044: darcs send should do optimize --reorder)

I convinced myself of what needs to be done: calculate the "exact" dependencies of the patches to send, so that those dependencies become the context of the bundle. "Exact" in quotes because for big repositories this can be very costly, so calculating only back to a certain tag seems appropriate.
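To make the idea concrete, here is a toy sketch in Python (invented code, not the actual Haskell implementation in darcs): a patch depends on an earlier patch when the two do not commute, and here "commutes" is crudely approximated by "the two patches touch disjoint sets of files".

```python
# Toy model of patch dependency calculation (hypothetical, not darcs' code).
# A patch "depends" on an earlier patch when the two do not commute; here we
# approximate commutation by "the patches touch disjoint sets of files".

def commutes(p, q):
    """Crude stand-in for darcs' commute: True if patches are independent."""
    return not (p["files"] & q["files"])

def dependencies(patches):
    """Map each patch name to the earlier patches it depends on."""
    deps = {}
    for i, p in enumerate(patches):
        deps[p["name"]] = {q["name"] for q in patches[:i] if not commutes(p, q)}
    return deps

def to_dot(deps):
    """Render the dependency graph in the dot language."""
    lines = ["digraph deps {"]
    for patch, ds in sorted(deps.items()):
        for d in sorted(ds):
            lines.append('  "%s" -> "%s";' % (patch, d))
    lines.append("}")
    return "\n".join(lines)

patches = [
    {"name": "add f",  "files": {"f"}},
    {"name": "edit f", "files": {"f"}},
    {"name": "add g",  "files": {"g"}},
]
deps = dependencies(patches)
print(to_dot(deps))  # "edit f" depends on "add f"; "add g" depends on nothing
```

The minimal context for a bundle would then be the (transitive) dependencies of the patches being sent, rather than the whole history before them.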

Now, one concern is the cost of searching for the dependencies. About this, I can first comment on some of the things I did during the week, and later show what I think are encouraging examples. So first, maybe the most relevant thing of the week is the implementation of the command darcs show dependencies, with the following description:

Usage: darcs show dependencies [OPTION]...
Generate the graph of dependencies.

Options:
--from-match=PATTERN  select changes starting with a patch matching PATTERN
--from-patch=REGEXP   select changes starting with a patch matching REGEXP
--from-tag=REGEXP     select changes starting with a tag matching REGEXP
--last=NUMBER         select the last NUMBER patches
--matches=PATTERN     select patches matching PATTERN
-p REGEXP   --patches=REGEXP      select patches matching REGEXP
-t REGEXP   --tags=REGEXP         select tags matching REGEXP
--disable             disable this command
-h          --help                shows brief description of command and its arguments

For the moment the command returns a graph described in the dot language; this may eventually change. But with the current output one can do:

$ darcs show dep | dot -Tpdf -o deps.pdf

to draw the graph in a pdf. Finally, to calculate the dependencies I use more or less the idea that Ganesh describes here. Moving on to the examples, it is interesting, thinking about the performance of an implementation of darcs send --minimize-context using this approach, to see the following results:

1. Show the dependencies after the tag 2.9.9 (75 patches)

time darcs show dep --from-tag=2.9.9
real    0m0.397s
user    0m0.373s
sys     0m0.026s
darcsDesps299.pdf

2. Show the dependencies after the tag 2.9.8 (133 patches)

time darcs show dep --from-tag=2.9.8
real    0m2.951s
user    0m2.865s
sys     0m0.082s
darcsDesps298.pdf

3. Show the dependencies after the tag 2.9.7 (288 patches)

time darcs show dep --from-tag=2.9.7
real    0m26.654s
user    0m26.003s
sys     0m0.511s
darcsDesps297.pdf

4. Show the dependencies after the tag 2.9.6 (358 patches)

time darcs show dep --from-tag=2.9.6
real    0m39.019s
user    0m38.302s
sys     0m0.666s
darcsDesps296.pdf

5. Show the dependencies after the tag 2.9.5 (533 patches)

time darcs show dep --from-tag=2.9.5
real    1m53.730s
user    1m51.343s
sys     0m1.939s
darcsDesps295.pdf

A rushed conclusion: the performance seems quite good, even more so if we consider that to compute the dependency graph we calculate the dependencies of all the selected patches against all the selected patches, while the option for send would only require calculating the patches to send against all the selected patches.

Marcio Diaz

GSoC Progress Report #5: Starting the implementation of darcs undo

July 24, 2014 03:16 AM UTC

I'm starting with the implementation of the command darcs undo. For now it's quite simple: it just uses a copy of the previous hashed_inventory. But, for example, with the current implementation we can undo amend-record:

mkdir test
cd test/
darcs init
Repository initialized.
touch f g; darcs add f g
Adding 'f'
Adding 'g'
darcs record
addfile ./f
Shall I record this change? (1/2) [ynW...], or ? for more options: y
addfile ./g
Shall I record this change? (2/2) [ynW...], or ? for more options: n
Do you want to record these changes? [Yglqk...], or ? for more options: y
Finished recording patch 'add f'
darcs whatsnew
addfile ./g
darcs amend-record
Thu Jul 24 00:07:53 ART 2014  Marcio Diaz <***>
  * add f
Shall I amend this patch? [yNjk...], or ? for more options: y
addfile ./g
Shall I record this change? (1/1) [ynW...], or ? for more options: y
Do you want to record these changes? [Yglqk...], or ? for more options: y
Finished amending patch:
Thu Jul 24 00:08:17 ART 2014  Marcio Diaz <***>
  * add f
darcs whatsnew
No changes!
darcs undo
darcs whatsnew
addfile ./g
darcs changes
Thu Jul 24 00:07:53 ART 2014  Marcio Diaz <***>
  * add f

Commands affected:
- darcs undo

Patches created: http://bugs.darcs.net/patch1182.

GSoC Progress Report #4: Garbage Collection for the Global Cache

July 24, 2014 02:46 AM UTC

The implementation of the command darcs optimize global-cache is almost finished. This command can reduce the size of the global cache, and at the same time we can select which files we want to keep in the cache. The command takes directories as arguments, searches for repositories within them, and deletes all files in the global cache that are not being used by these repositories.

I'll give an example using the repository darcs.net. The first time you clone this repository, it takes approximately 8 minutes:

time darcs clone http://darcs.net
Welcome to the darcs screened repository.
...
Finished cloning.
real    8m16.042s
user    0m36.568s
sys     0m9.807s

The second time, because it uses the global cache, it takes only 26 seconds:

time darcs clone http://darcs.net
Welcome to the darcs screened repository.
...
Finished cloning.
real    0m26.230s
user    0m15.293s
sys     0m2.891s

But we have a pretty heavy cache (we have other repositories besides darcs.net):

du -sh ~/.cache/darcs/inventories/
277M    /home/marcio/.cache/darcs/inventories
du -sh ~/.cache/darcs/patches/
325M    /home/marcio/.cache/darcs/patches/
du -sh ~/.cache/darcs/pristine.hashed/
279M    /home/marcio/.cache/darcs/pristine.hashed/

If we need to free disk space, we can delete the entire cache using:

rm -rf ~/.cache/darcs/

but the next time we clone http://darcs.net it will again take 8 minutes. So, how can we reclaim space from the global cache and at the same time keep clones of http://darcs.net down to 26 seconds? Using darcs optimize global-cache:

time darcs optimize global-cache darcs.net
Done cleaning global cache!
real    0m36.668s
user    0m15.492s
sys     0m7.932s

It takes 36 seconds to clean the cache, leaving only the files needed by the repository darcs.net:

du -sh ~/.cache/darcs/inventories
21M     /home/marcio/.cache/darcs/inventories
du -sh ~/.cache/darcs/patches/
69M     /home/marcio/.cache/darcs/patches/
du -sh ~/.cache/darcs/pristine.hashed/
23M     /home/marcio/.cache/darcs/pristine.hashed/
time darcs clone http://darcs.net
Welcome to the darcs screened repository.
...
Finished cloning.
real    0m26.173s
user    0m15.164s
sys     0m2.905s

Commands affected:
- darcs optimize global-cache

Patches created:

July 23, 2014

Marcio Diaz

GSoC Progress Report #3: Bucketed Global Cache (completed!)

July 23, 2014 11:01 PM UTC

I finished the implementation of the bucketed cache. Now users can run darcs optimize cache and migrate the old cache to the new bucketed cache. For example, if we have a cache like in the first figure, running darcs optimize cache reorganizes it as in the second figure. This way of organizing the cache allows programs to navigate through it faster.
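The bucketed layout can be sketched as follows (illustrative Python; `bucketed_path` is an invented name, not darcs' code — only the path convention comes from the post):

```python
# Sketch of the bucketed cache layout (illustrative, not darcs' code).
# Files are spread over 256 sub-folders keyed by the first two hex digits
# of their hashed file name, e.g. ~/.cache/darcs/patches/00/0000008516-...
import os.path

def bucketed_path(cache_dir, hashed_name):
    """Return the bucketed location of a cache file."""
    bucket = hashed_name[:2]  # first two hex digits -> 256 buckets
    return os.path.join(cache_dir, bucket, hashed_name)

name = "0000008516-0048abbb8a2b11870fe24fef48bc2ebb49cbd818a633b8250dc2023e4f6267c9"
print(bucketed_path("~/.cache/darcs/patches", name))
```

With ~10,000 patches this gives roughly 40 files per bucket, which is why directory listings become near-instant.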
For example, this is what happens if I want to open the folder with the patches of my global cache: it takes about 20 seconds to load 10,000 patches. But if we use the bucketed cache, the folder of patches loads instantly, because now this folder contains only 256 sub-folders, with about 40 patches in each one.

Commands affected:
- darcs optimize cache

Issues solved:

Patches created:

GSoC Progress Report #2: Bucketed Global Cache

July 23, 2014 10:55 PM UTC

After my trip to France I'm back working on the Google Summer of Code for darcs. My goals this week are to finish my patches for the bucketed cache and garbage collection of the global cache.

Bucketed Cache

Some programs have problems with directories containing a lot of files, for example the ls command on Linux. In order to reduce the number of files per folder in the global cache (which can grow considerably), we decided to implement a bucketed global cache. The patch for the bucketed cache is this one: http://bugs.darcs.net/patch1162. It transforms the global cache into a bucketed global cache. That is, instead of having all the cache files in a single folder, we divide them into sub-folders according to the file name. For example, using the first two digits of the hash, we put the patch named:

0000008516-0048abbb8a2b11870fe24fef48bc2ebb49cbd818a633b8250dc2023e4f6267c9

in the sub-folder /00/, i.e., in ~/.cache/darcs/patches/00/. The same goes for the inventories and pristine files. However, there are several ways to implement this patch:

1. We can forget the old cache (which is located in ~/.darcs/cache/).
• The advantage of this approach is that the code is cleaner and faster (not sure if significantly faster), because we don't need to look for patches in multiple locations.
• The disadvantage is that the user starts with an empty cache again, although we can provide a command responsible for moving the old cache files to the new format.
2. The other way is to always read both caches.
The code of this version is a little more complicated because it needs to read from several locations, but the user's cache remains intact.

Different versions of darcs use different global caches:
• 2.9.9 (+45 patches) (last development version): this version can read the caches in ~/.darcs/cache/ and ~/.cache/darcs/ (new cache). In addition, each time the old cache is used, the files are linked into the new cache. However this new cache is not bucketed.
• 2.8.4 (the latest stable release): used by end users, it only uses the cache in ~/.darcs/cache/.

The patch in http://bugs.darcs.net/msg17565 changes the code of version 2.9.9 so that, when the old cache is used, darcs links the files read into the new bucketed cache in ~/.cache/darcs/. It also provides a new command (darcs optimize cache) responsible for moving the files from old caches to the new bucketed cache. Tests were performed to see whether the bucketed version of the cache improved the performance of darcs commands like clone, but no significant difference was found.

Global Garbage Collection

This patch should give the user mechanisms to reduce the size of the global cache according to their needs. For now this patch only counts the number of hard links of the files in the global cache: if a file has only one link, it's deleted. This patch hasn't been sent to screened yet.

July 15, 2014

Ale Gadea

Month of June

July 15, 2014 01:43 AM UTC

Here goes a little summary of what I have been doing between late June (9~21) and early July (1~11).

First, and easy: I have been documenting Darcs.misplacedPatches (old name chooseOrder), D.P.W.Ordered and D.P.W.Sealed. Something to note about the semantics of misplacedPatches is that darcs optimize reorder cannot always clean a tag.
For example, suppose we have a repository $r_1$ with the following patches:

$r_1$ $=$ $t_{1,0}$ $p_{1,0}$ $t_{1,1}$

here all tags are clean. But if we make another repository, say $r_2$, and we pull from $r_1$ in the following way:

$ darcs pull -a -p "p_{1,0}" r_1

(we want to pull the patch $p_{1,0}$; we assume that the name of the patch is $p_{1,0}$ for matching with the -p option)

$ darcs pull -a r_1

so now we have,

$r_2$ $=$ $p_{1,0}$ $t_{1,0}$ $t_{1,1}$

and we see that $t_{1,0}$ is dirty. Running darcs optimize reorder does not reorder anything. What is going on is that, to know what to reorder, misplacedPatches takes the first tag, in our case $t_{1,1}$, and "searches" for the patches it does not tag. But $p_{1,0}$ and $t_{1,0}$ are both tagged by $t_{1,1}$, so there is nothing to reorder even though $t_{1,0}$ is dirty. Therefore there is no way to clean $t_{1,0}$: because misplacedPatches always takes the first tag, if a tag is tagging one or more dirty tags, those tags will never become clean.
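The limitation can be illustrated with a small toy model (invented Python, not the darcs implementation): the newest tag is found, and only the patches it does not cover are candidates for moving.

```python
# Toy model (invented) of why misplacedPatches cannot clean t_{1,0} above:
# it only looks at the newest tag and moves the patches that tag does NOT
# cover; patches already covered are never reordered.

def misplaced(history, covers):
    """history: patch names, oldest to newest; covers: tag -> covered patches.
    Return the patches the newest tag does not cover (candidates to move)."""
    newest_tag = next(p for p in reversed(history) if p in covers)
    return [p for p in history if p not in covers[newest_tag] and p != newest_tag]

# r_2 = p10 t10 t11, where t11 covers everything before it and t10 covers nothing.
history = ["p10", "t10", "t11"]
covers = {"t10": set(), "t11": {"p10", "t10"}}
print(misplaced(history, covers))  # [] -> nothing to reorder, t10 stays dirty
```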

"Second", using the implementation of "reorder" one can get, almost for free, a --reorder option for the commands pull, apply and rebase pull. The behavior in the case of pull (for the other commands it's the same basic idea) is that our local patches remain on top after a pull from a remote repository, i.e. suppose we have the following $l$(ocal) and $r$(emote) repositories,

$l$ $=$ $p_1$ $p_2$ $\ldots$ $p_n$ $lp_{n+1}$ $\ldots$ $lp_m$

$r$ $=$ $p_1$ $p_2$ $\ldots$ $p_n$ $rp_{n+1}$ $\ldots$ $rp_k$

where the $lp$ are the local patches that don't belong to $r$, and vice versa for the $rp$. Running darcs pull leaves $l$ as follows,

$l$ $=$ $p_1$ $p_2$ $\ldots$ $p_n$ $lp_{n+1}$ $\ldots$ $lp_m$ $rp_{n+1}$ $\ldots$ $rp_k$

meanwhile, running darcs pull --reorder leaves $l$ as,

$l$ $=$ $p_1$ $p_2$ $\ldots$ $p_n$ $rp_{n+1}$ $\ldots$ $rp_k$ $lp_{n+1}$ $\ldots$ $lp_m$

making it easier to send the $lp$ patches later.
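The effect on the patch sequence can be sketched like this (a toy Python illustration with invented names, which assumes for simplicity that the patches commute freely):

```python
# Toy illustration (not darcs' code) of what pull --reorder does to the
# patch sequence: remote-only patches are inserted before local-only ones,
# so local work stays on top.

def pull(local, remote, reorder=False):
    common = [p for p in local if p in remote]
    local_only = [p for p in local if p not in remote]
    remote_only = [p for p in remote if p not in local]
    if reorder:
        return common + remote_only + local_only  # local patches stay on top
    return common + local_only + remote_only

l = ["p1", "p2", "lp3", "lp4"]
r = ["p1", "p2", "rp3"]
print(pull(l, r))                # ['p1', 'p2', 'lp3', 'lp4', 'rp3']
print(pull(l, r, reorder=True))  # ['p1', 'p2', 'rp3', 'lp3', 'lp4']
```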

"Third", beginning a new task: implement the minimize-context option for the command darcs send. There's still not much to say; I have almost finished implementing the option, but with some doubts. I hope that by the end of the week I'll have a "prettier" implementation as well as a better understanding.

July 04, 2014

the Patch-Tag blog

Patch-tag is shutting down on August 4 2014. Please migrate repos to hub.darcs.net.

July 04, 2014 07:24 PM UTC

Patch-tag users:

I have made the decision to shut down patch tag.

I’ve taken this step because I have stopped developing on patch tag, the site has a material time and money cost, and the technical aspects that made it a valuable learning experience have decreased to the point that I have hit diminishing returns.

The suggested continuity path is to move repos to Simon Michaels’s excellent hub.darcs.net.

To end on a positive note, I would like to say: no regrets! Creating patch tag was a definite high point of my career, opened valuable doors, and engendered even more valuable partnerships and collaborations. To my users and everyone who has helped, you are awesome and it was a lot of fun seeing the repos come online.

I may write a more in depth post mortem at a later time, but for now I just wanted to make a public statement and nudge remaining users to take appropriate action.

If there is anybody that would like to take over patch tag operations to keep the site going, I am open to handing over the reins so don’t be shy. I floated this offer among some private channels in the darcs community a while back, and the response then was… not overwhelming. But maybe the public announcement will bring in some new blood.

Thanks for using patch tag.

Happy tagging,

Thomas Hartman

June 25, 2014

Darcs News

darcs news #104

June 25, 2014 04:59 AM UTC

News and discussions

1. Google Summer of Code 2013 has begun! BSRK and José will post updates on their blogs:

Issues resolved (8)

issue2163 Radoslav Dorcik
issue2227 Ganesh Sittampalam
issue2248 Ganesh Sittampalam
issue2250 BSRK Aditya
issue2311 Sebastian Fischer
issue2312 Sebastian Fischer
issue2320 Jose Luis Neder
issue2321 Jose Luis Neder

Patches applied (20)

See darcs wiki entry for details.

darcs news #105

June 25, 2014 04:58 AM UTC

News and discussions

1. This year's Google Summer of Code projects brought a lot of improvements to darcs and its ecosystem!
• BSRK Aditya: Darcsden improvements:
• José Neder: patience diff, file move detection, token replace detection:
2. Gian Piero Carrubba asked why adjacent hunks could not commute:
3. We listed the changes that occurred between version 2.8.4 and the current development branch into a 2.10 release page:

Issues resolved (8)

issue346 Jose Luis Neder
issue1828 Guillaume Hoffmann
issue2181 Guillaume Hoffmann
issue2309 Owen Stephens
issue2313 Jose Luis Neder
issue2334 Guillaume Hoffmann
issue2343 Jose Luis Neder
issue2347 Guillaume Hoffmann

Patches applied (39)

See darcs wiki entry for details.

Darcs News #106

June 25, 2014 04:58 AM UTC

News and discussions

1. Darcs is participating once again in the Google Summer of Code, through the umbrella organization Haskell.org. The deadline for student applications is Friday the 21st:
2. It is now possible to donate stock to darcs through the Software Freedom Conservancy organization. Donations by Paypal, Flattr, checks and wire transfer are still possible:
3. Dan Licata wrote a presentation about Darcs as a higher inductive type:
4. Darcs now directly provides import and export commands with Git. This code was adapted from Petr Rockai's darcs-fastconvert, with some changes by Owen Stephen from his Summer of Code project "darcs-bridge":

Issues resolved (6)

issue642 Jose Luis Neder
issue2209 Jose Luis Neder
issue2319 Guillaume Hoffmann
issue2332 Guillaume Hoffmann
issue2335 Guillaume Hoffmann
issue2348 Ryan

Patches applied (34)

See darcs wiki entry for details.

Darcs News #107

June 25, 2014 04:57 AM UTC

News and discussions

1. Darcs has received two grants from the Google Summer of Code program, as part of the umbrella organization Haskell.org. Alejandro Gadea will work on history reordering:
2. Marcio Diaz will work on the cache system:
3. Repository cloning to remote ssh hosts has been present for years as darcs put. This feature now has a more efficient implementation:

Issues resolved (11)

issue851 Dan Frumin
issue1066 Guillaume Hoffmann
issue1268 Guillaume Hoffmann
issue1416 Ale Gadea
issue1987 Marcio Diaz
issue2263 Ale Gadea
issue2345 Dan Frumin
issue2357 Dan Frumin
issue2365 Guillaume Hoffmann
issue2367 Guillaume Hoffmann
issue2379 Guillaume Hoffmann

Patches applied (41)

See darcs wiki entry for details.

Darcs News #108

June 25, 2014 04:57 AM UTC

News and discussions

1. We have a few updates from the Google Summer of Code projects. Alejandro Gadea about history reordering:
2. Marcio Diaz about the cache system:
3. Incremental fast-export is now provided to ease maintenance of git mirrors:

Issues resolved (8)

issue2244 Ale Gadea
issue2314 Benjamin Franksen
issue2361 Ale Gadea
issue2364 Sergei Trofimovich
issue2364 Sergei Trofimovich
issue2388 Owen Stephens
issue2394 Guillaume Hoffmann
issue2396 Guillaume Hoffmann

Patches applied (39)

See darcs wiki entry for details.

June 12, 2014

Ale Gadea

Third Week (02-06 june)

June 12, 2014 04:58 PM UTC

Well, well... Now, with the solution already implemented, here are a couple of timing tests that show the improvement.

For the repository of the issue2361:

Before patch1169
"let it run for 2 hours and it did not finish"

After patch1169
real    0m5.929s
user    0m5.683s
sys     0m0.260s

For the repository generated by forever.sh, which in summary has ~12600 patches and an unrevert bundle, doing the reorder implies moving ~1100 patches forward past ~11500 patches.

Before patch1169
(Interrupted!)
real    73m9.894s
user    71m28.256s
sys     1m11.439s

After patch1169
real    2m23.405s
user    2m17.347s
sys     0m6.030s

The repository generated by bigRepo.sh has ~600 patches, with only one tag and a very small unrevert bundle.

Before patch1169
real        0m34.049s
user        0m33.386s
sys         0m0.665s

After patch1169
real        0m1.053s
user        0m0.960s
sys         0m0.152s

One last repository, generated by bigUnrevert.sh, has 13 patches and a really big unrevert bundle (~10MB).

Before patch1169
real    0m1.304s
user    0m0.499s
sys     0m0.090s

After patch1169
real    0m0.075s
user    0m0.016s
sys     0m0.011s

The repository with more examples is here: ExamplesRepos.

June 05, 2014

Ale Gadea

Second Week (26-30 may)

June 05, 2014 06:47 PM UTC

Luckily, this week with Guillaume we found a "solution" for issue 2361. But before entering into details, let's review how the command darcs optimize --reorder reorders the patches.

So, suppose we have the following repositories where, reading from left to right, we have the first patch to the last patch. With $p_{i,j}$ we denote the $i$-th patch belonging to the $j$-th repository, and when we want to specify that a patch $p_{i,j}$ is a tag we write $t_{i,j}$.

$r_1$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{n+1,1}$ $\ldots$ $p_{m,1}$

$r_2$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{1,2}$ $\ldots$ $p_{k,2}$ $t_{1,2}$ $p_{k+1,2}$ $\ldots$ $p_{l,2}$

where the red part represents when $r_2$ was cloned from $r_1$, and the rest is how each repository evolved. Now, suppose we merge $r_1$ and $r_2$ into $r_1$ by making a bundle of the patches of $r_2$ and applying it in $r_1$. Thus, after the merge we have

$r_1$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{n+1,1}$ $\ldots$ $p_{m,1}$ $p_{1,2}$ $\ldots$ $p_{k,2}$ $t_{1,2}$ $p_{k+1,2}$ $\ldots$ $p_{l,2}$

and we find the situation where the tag $t_{1,2}$ is dirty because the green part is in the middle. Now we are in a position to find out how darcs reorders patches.
The first task is to select the first tag reading $r_1$ in reverse; suppose $t_{1,2}$ is the first one (i.e., $p_{k+1,2}$ $\ldots$ $p_{l,2}$ are not tags), and split the set of patches (the repository) into

$ps_{t_{1,2}}$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{1,2}$ $\ldots$ $p_{k,2}$ $t_{1,2}$

and the rest of the patch set,

$rest$ $=$ $p_{n+1,1}$ $\ldots$ $p_{m,1}$ $p_{k+1,2}$ $\ldots$ $p_{l,2}$

this is done by splitOnTag, which I don't totally understand yet, so for the moment... simply take the above as given :) Then, the part that interests us now is $rest$: we want to remove all the patches of $rest$ from $r_1$ and then add them again, causing them to show up at the right end. This job is done by tentativelyReplacePatches, which first calls tentativelyRemovePatches and then calls tentativelyAddPatches.

So, tentativelyRemovePatches on $r_1$ and $rest$ gives,

$r_{1}'$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{1,2}$ $\ldots$ $p_{k,2}$ $t_{1,2}$

and tentativelyAddPatches on $r_{1}'$ and $rest$,

$r_{1}''$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{1,2}$ $\ldots$ $p_{k,2}$ $t_{1,2}$ $p_{n+1,1}$ $\ldots$ $p_{m,1}$  $p_{k+1,2}$ $\ldots$ $p_{l,2}$

leaving $t_{1,2}$ clean.
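The split/remove/re-add steps above can be sketched as a toy Python model (invented code; the real splitOnTag works by commutation on the patch sequence, not on plain sets):

```python
# Toy sketch (not darcs' code) of the reorder steps: split the history at
# the newest tag, then remove the remainder and append it again.

def reorder(history, tag, covered):
    """Put everything the tag covers (plus the tag) first; re-add the rest."""
    ps_tag = [p for p in history if p in covered or p == tag]  # splitOnTag
    rest = [p for p in history if p not in covered and p != tag]
    # tentativelyRemovePatches leaves ps_tag; tentativelyAddPatches appends rest:
    return ps_tag + rest

history = ["p1", "p2", "q1", "t", "q2"]  # p2 landed between q1 and its tag t
covered = {"p1", "q1"}                   # patches tagged by t
print(reorder(history, "t", covered))    # ['p1', 'q1', 't', 'p2', 'q2']
```

After the call, the tag `t` is clean: everything before it is covered by it.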

Well, all of this was to understand the "solution" for the issue. We are almost there, but first let's look at the function tentativelyRemovePatches. It attempts to remove patches with one special care: when one runs darcs revert, a special file called unrevert is generated in _darcs/patches, which is used by darcs unrevert in case one makes a mistake with darcs revert. One important difference is that, unlike all the other files in _darcs/patches, unrevert is not a patch but a bundle, which contains a patch and a context. This context makes it possible to know whether the patch is applicable. So when one removes a patch (running for example obliterate, unrecord or amend), that patch has to be removed from the unrevert bundle (the bundle in the file _darcs/patches/unrevert). It's not always possible to adjust the unrevert bundle, in which case the operation continues only if the user agrees to delete the unrevert bundle.

But now a question emerges: is it necessary to fix up the unrevert bundle in the case of a reorder? The answer is no, because we don't delete any patch of $r_1$, so we can still apply the unrevert bundle in $r_{1}''$.
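A minimal model of that argument (invented Python, simplifying the bundle's context to a set of required patches):

```python
# Toy model (not darcs' code) of why reorder need not touch the unrevert
# bundle: the bundle stays applicable as long as every patch in its context
# is still present in the repository; reorder changes order, not membership.

def bundle_applicable(repo_patches, bundle_context):
    """True if all patches the bundle's context depends on are still present."""
    return bundle_context <= set(repo_patches)

context = {"p1", "p2"}                  # context stored in the unrevert bundle

reordered = ["p2", "p1", "t1", "p3"]    # reorder: same patches, new order
print(bundle_applicable(reordered, context))         # True

after_obliterate = ["p1", "t1", "p3"]   # obliterate p2: context broken
print(bundle_applicable(after_obliterate, context))  # False
```

This is why removal commands like obliterate must adjust (or discard) the bundle, while reorder can leave it alone.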

So, finally! We find that for reorder we need a special case of removal, one which doesn't try to update the unrevert bundle. And this ends up being the "solution" for the issue, since the reorder blocks in that function. But! Beyond the fact that this solves the issue, something weird is happening, which is the reason for the double quotes around "solution" :)

This is more or less the progress so far. The tasks ahead are documenting the code in various parts and writing the special case for the function tentativelyRemovePatches. Along the way I will probably understand more about some of the functions I mentioned before, so I will probably add more info and rectify whatever is needed.

June 03, 2014

Ale Gadea

Google Summer of Code 2014 - Darcs

June 03, 2014 06:46 PM UTC

Hi hi all!

I have been accepted into GSoC 2014 :), and as part of the work I'll be writing about my progress. The original plan is to have a summary per week (or at least I hope so, jeje).

I have already been reading some of the darcs code and fixing some issues:

Issue 2263 ~ Patch 1126
Issue 1416 ~ Patch 1135
Issue 2244 ~ Patch 1147 (needs-screening) (not any more $\ddot\smile$)

The details about the project are in History Reordering Performance and Features. Some issues related to the project are:

Issue 2361
Issue 2044

Cheers!

First Week (19-23 may)

June 03, 2014 06:42 PM UTC

Sadly, a first slow week: I lost Monday to problems with my notebook, for which I had to reinstall ghc, cabal, all the libraries, etc... but in the end this helped :)

The list of tasks for the week includes:

1. Compile and run darcs with profiling flags
2. Write scripts to generate dirty-tagged big repositories
3. Check memory usage with hp2any for the command optimize --reorder for the generated repositories and repo-issue2361
4. Check performance difference with and without patch-index
5. Document reorder implementation on wiki
6. Actually debug/optimize reorder of issue2361 (Stretch goal)

1. Compile and run darcs with profiling flags

This seems pretty easy at first, but it turned out to be somewhat annoying because one has to install all the libraries with profiling enabled. So, a mini step-by-step of my installation of darcs with profiling flags (I'm using Ubuntu 14.04, ghc-7.6.3 and cabal-install-1.20.0.2):

- Install the ghc-prof package, in my case with sudo apt-get install ghc-prof
- Install the dependencies of darcs with library profiling enabled, either:
  - $ cabal install LIB --enable-library-profiling (for each library :) )
  - or by setting library-profiling: True in ~/.cabal/config
- Finally, install darcs with --enable-library-profiling and --enable-executable-profiling

2. Write scripts to generate dirty-tagged big repositories

Not much to say about this: I wrote some libraries to make the scripts that generate the repositories more straightforward, and I wrote some examples, but I'm still in search of interesting ones. Along the week I will probably add examples, hopefully interesting ones.

3, 4 and 5 all together and mixed

Now, when I finally started to generate the example repositories and play with hp2ps to check different things, I started to think about other things and ended up studying the implementation of the command optimize --reorder. In particular, I started to write a version which prints some info during the ordering of patches, but for now it's a very dirty implementation.

April 27, 2014

Marcio Diaz

GSoC Progress Report #1: Complete Repository Garbage Collection

April 27, 2014 05:06 AM UTC

In my first week I worked on completing the garbage collection for repositories.

Darcs stores all the information needed under _darcs directory. In this part of the project we are only interested in the files stored in three directories:

• _darcs/patches/: stores the patches.
• _darcs/pristine.hashed/: stores the last saved state of the working copy.
• _darcs/inventories/: stores the inventories (lists of patches).

While working on a project under version control, these directories grow in size. Every time we record a new patch:
• A new inventory file is stored in _darcs/inventories/ containing the augmented list of patches. The old inventory file (without the new patch) is then no longer needed (in most cases).
• A new patch file is stored in _darcs/patches/. If we later unrecord this patch, the patch file is no longer needed.
• The same happens with _darcs/pristine.hashed/.

So, why do we keep these files if we no longer need them? Well, that's because darcs wants to be fast and does not delete these files as it goes. Also, if the repository is public and someone is cloning it, you don't want files disappearing in the process.

Using the "darcs optimize" command, darcs only knew how to clean up the _darcs/pristine.hashed directory. Until now, the only way to clean the other two directories was to do a "darcs get". With the changes introduced, "darcs optimize" now also cleans these directories.

Algorithms:

The implemented algorithm is pretty straightforward. In pseudo-code:

inventory = _darcs/hashed_inventory
while (inventory):
    useful_inventories += inventory
    inventory = next_inventory(inventory)
remove files not in useful_inventories

inventory = _darcs/hashed_inventory
while (inventory):
    useful_patches += get_patches(inventory)
    inventory = next_inventory(inventory)
remove files not in useful_patches

We can see that we traverse the inventory list twice, once for the inventories and once for the patches. Although this is not optimal, I think it is more modular, since now we have a function that gets the list of patches.
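The pseudo-code above can be turned into a runnable sketch (illustrative Python; the inventory chain is mocked, not darcs' real on-disk format, and the two walks are merged into one for brevity):

```python
# Runnable sketch of the garbage collection above (illustrative, not darcs'
# code). Walk the inventory chain from hashed_inventory, gather the useful
# inventories and patches, and everything else is garbage.

def collect_useful(start, next_inventory, get_patches):
    """Walk the inventory chain, gathering useful inventories and patches."""
    useful_inventories, useful_patches = set(), set()
    inventory = start
    while inventory is not None:
        useful_inventories.add(inventory)
        useful_patches.update(get_patches(inventory))
        inventory = next_inventory(inventory)
    return useful_inventories, useful_patches

# Mocked repository: inv2 -> inv1; inv0 is an orphan left by an unrecord.
chain = {"inv2": "inv1", "inv1": None, "inv0": None}
patches = {"inv2": {"p2"}, "inv1": {"p1"}, "inv0": {"p0"}}

useful_invs, useful_ps = collect_useful(
    "inv2", chain.get, lambda inv: patches[inv])

all_inventories = {"inv0", "inv1", "inv2"}
all_patches = {"p0", "p1", "p2"}
print(sorted(all_inventories - useful_invs))  # ['inv0'] would be removed
print(sorted(all_patches - useful_ps))        # ['p0'] would be removed
```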

Commands affected:

- darcs optimize

Use cases:

It is useful when you need to free space on your hard disk.
For example:
- Record a new patch.
- Unrecord the new patch.
- Run optimize to garbage-collect the unused files corresponding to the unrecorded patch. Details in: http://pastebin.com/vYHiYV0F
You can find more use cases in the regression test script:

Issues solved:

Patches created:

http://bugs.darcs.net/patch1134.

April 26, 2014

Marcio Diaz

GSoC project accepted

April 26, 2014 09:36 PM UTC

I was accepted for the Google Summer of Code 2014. I'll be working for Haskell.org and my project will focus on improvements to the Darcs version control system.

The project consists of several parts:

1. Complete garbage collection for repositories.
2. Bucketed global cache.
3. Garbage collection of global cache.
4. Investigate and implement the darcs undo command.
5. Investigate and implement the darcs undelete command.

Here is a detailed description of my project proposal: http://darcs.net/GSoC/2014-Hashed-Files-And-Cache.

I'll try to give weekly updates of how my work is going, and let you know about the problems and solutions that I find in my way.

Thanks Haskell.org, thanks Darcs, and last but not least thanks Google for giving us this awesome opportunity.

November 03, 2013

Simon Michael

darcsum 1.3

November 03, 2013 07:38 PM UTC

darcsum was hanging again, so I made some updates:

• Fix a hang when reverting, when darcs responds with “Will not ask whether to revert this already decided patch…”.

• Fixed an error in at least my local darcsum, which caused it to break when darcsum-debug was enabled.

• Fixed the four warnings my emacs gave when byte-compiling it. These fixes could use some testing.

• Reviewed the status and backlog. Last release was 2010, the ELPA package dates from 2012, there’s a bunch of unreleased fixes, the site script needs updating for hakyll 4, the project still needs a maintainer.

And since I came this far, I’ll tag and announce darcsum 1.3. Hurrah!

This release includes many fixes from Dave Love and one from Simon Marlow. Here are the release notes.

Site and ELPA package updates will follow asap. All help is welcome.

September 26, 2013

Simon Michael

darcsden/darcs hub GSOC complete

September 26, 2013 11:48 AM UTC

Aditya BSRK’s darcsden-improvement GSOC has concluded, and I’ve recently merged almost all of the pending work and deployed it on darcs hub.

You can always see the recently landed changes here, but let me describe the latest features a little more:

File history - when you browse a file, there’s a new “file changes” button which shows just the changes affecting that file.

File annotate - there’s also a new “annotate” button, providing the standard view showing which commit last touched each line of the file. (also known as the blame/praise feature). It needs some CSS polish but I’m glad that the basic side-by-side layout is there.

More reliable highlighting while editing - the file editor was failing to highlight many common programming languages - this should be working better now. (Note highlighting while viewing and highlighting while editing are independent and probably use different colour schemes, this is a known open wishlist item.)

Repository compare - when viewing a repo’s branches, there’s a new “compare” button which lets you compare (and merge from) any two public repos on darcs hub, showing the unique patches on each side.

Cosmetic fixes - various minor layout and rendering issues were fixed. One point of discussion was whether to use the two-sided layout on the repo branches page as well. Since there wasn’t time to make that really usable I vetoed it in favour of the less confusing one-sided layout. I think showing both sides works well on the compare page though.

Patch bundle support - the last big feature of the GSOC was patch bundles. This is an alternative to the fork repo/request merge workflow, intended to be more lightweight and easy for casual contributors. There are two parts. First, darcs hub issue trackers can now store darcs patch bundle files (one per issue I think). This means patches can be uploaded to an issue, much like the current Darcs issue/patch tracker. But you can also browse and merge patches directly from a bundle, just as you can from another repo.

The second part (not yet deployed) is support for a previously unused feature built in to the darcs send command, which can post patches directly to a url instead of emailing them. The idea (championed by Aditya and Ganesh) is to make it very easy for someone to darcs send patches upstream to the project’s issue tracker, without having to fork a repo, or even create an account on darcs hub. As you can imagine, some safeguards are important to avoid becoming a spam vector or long-term maintenance headache, but the required change(s) are small and I hope we’ll have this piece working soon. It should be interesting to have both workflows available and see which works where.

I won’t recap the older new features, except to say that pack support is in need of more testing. If you ever find darcs get to be slow, perhaps you’d like to help test and troubleshoot packs, since they can potentially make this much faster. Also there are a number of low-hanging UI improvements we can make, and more (relatively easy) bugs keep landing in the darcs hub/darcsden issue tracker. It’s a great time to hack on darcs hub/darcsden and every day make it a little more fun and efficient to work with.

I really appreciate Aditya’s work, and that of his mentor, Ganesh Sittampalam. We did a lot of code review which was not always easy across a large time zone gap, but I think the results were good. Congratulations Aditya on completing the GSOC and delivering many useful features, which we can put to good use immediately. Thanks!

September 20, 2013

Jose Luis Neder

Automatic detection of replaces for Darcs - Part 1

September 20, 2013 03:25 PM UTC

In the last post I showed some examples and use cases of the "--look-for-replaces" flag for the whatsnew, record, and amend-record commands in Darcs. When used, this flag provides automatic detection of (possible) replaces, even when the modified files show more differences than just the replaces, and it even shows possible "forced" replaces.
The simplest case is when you perform a replace in your editor of choice, make no other change to the file, and then, after checking that everything is OK, remember that you could have used a replace patch.

file before:
line1 foo
line2 foo
line3 foo
file after:
line1 bar
line2 bar
line3 bar
Then, instead of:
> darcs revert -a file
Reverting changes in "file":

Finished reverting.
> darcs replace foo bar file
> darcs record -m "replace foo bar"
replace ./file [A-Za-z_0-9] foo bar
Shall I record this change? (1/1) [ynW...], or ? for more options: y
Do you want to record these changes? [Yglqk...], or ? for more options: y
Finished recording patch 'replace foo bar'
You could do:
> darcs record --look-for-replaces -m "replace foo bar"
replace ./file [A-Za-z_0-9] foo bar
Shall I record this change? (1/1) [ynW...], or ? for more options: y
Do you want to record these changes? [Yglqk...], or ? for more options: y
Finished recording patch 'replace foo bar'
But it doesn't have to be a full replace. For instance, say you don't want to change a couple of the occurrences:
file before:
line1 foo
line2 foo
line3 foo
line4 foo
file after:
line1 bar
line2 bar
line3 bar
line4 foo
Then, instead of:
> darcs whatsnew
hunk ./file 1
-line1 foo
-line2 foo
-line3 foo
+line1 bar
+line2 bar
+line3 bar
With the new flag you could record this:
> darcs whatsnew --look-for-replaces
replace ./file [A-Za-z_0-9] foo bar
hunk ./file 4
-line4 bar
+line4 foo
Say you replace a word with another word that already appears in the file. Normally this would mean you should use "darcs replace --force". The look-for-replaces flag always "forces" the replaces, so if you try this, the changes needed to make the replace reversible will be shown before the replace patch:
file before:
line1 foo
line2 foo
line3 foo
line4 bar
file after:
line1 bar
line2 bar
line3 bar
line4 bar
With the new flag you will see the same patches as if you had run "darcs replace --force foo bar file":
> darcs whatsnew --look-for-replaces
hunk ./file 4
-line4 bar
+line4 foo
replace ./file [A-Za-z_0-9] foo bar
Within certain limitations, any number of replaces can be detected, like this:
file before:
foo foo2 foo3
fee fee2 fee3
file after:
bar bar2 bar3
bor bor2 bor3
All the replaces are shown below:
> darcs whatsnew --look-for-replaces
replace ./file [A-Za-z_0-9] fee bor
replace ./file [A-Za-z_0-9] fee2 bor2
replace ./file [A-Za-z_0-9] fee3 bor3
replace ./file [A-Za-z_0-9] foo bar
replace ./file [A-Za-z_0-9] foo2 bar2
replace ./file [A-Za-z_0-9] foo3 bar3
If you want to know more about the limitations of this functionality, check Automatic detection of replaces for Darcs - Part 2.

Automatic detection of replaces for Darcs - Part 2

September 20, 2013 09:08 AM UTC

Over the last weeks I was implementing the "--look-for-replaces" flag for the whatsnew, record, and amend-record commands in Darcs. When used, this flag provides automatic detection of (possible) replaces, even when the modified files show more differences than just the replaces, provided they meet the following prerequisites:
1. For a given "word" and a given file, not all instances need to be replaced, but there must be only one possible replace suggestion, i.e.:

this is ok:
file before:
foo
foo
foo
file after:
foo
bar
bar
this is not detected:
file before:
foo
foo
foo
file after:
foo
bar
bar2
2. The replace must happen in lines that have the same number of words in the recorded and working states; otherwise it will not be detected.
this is ok:
file before:
foo
foo
foo
file after:
foo roo
bar fee
bar
this is not detected (it is unclear which replace should be detected anyway):
file before:
figaro foo
figaro foo
figaro foo
file after:
figaro foo
figaro bar bee
figaro foo bar
3. There must be at least one hunk containing the replace that has the same number of lines on the - and + sides.
this is not detected:
file before:
line1 foo
line2 foo
line3 foo
file after:
line1 bar
line2or3 bar
It would not detect this replace, even though it is a "perfect" replace, because the hunk does not have the same number of lines, and it is not trivial to tell which line was modified and which was deleted.
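Taken together, these prerequisites can be illustrated with a toy word-matching sketch (a Python illustration of the heuristic described in this post, not darcs's actual Haskell code): pair up words positionally in lines with equal word counts, and only keep suggestions that are unambiguous.

```python
# Toy sketch of the replace-detection heuristic, under the assumptions
# above: old/new lines already come paired (equal-length hunks), and
# words are only paired positionally when line word counts match.

def suggest_replaces(old_lines, new_lines):
    candidates = {}  # old word -> set of proposed new words
    for old, new in zip(old_lines, new_lines):
        ow, nw = old.split(), new.split()
        if len(ow) != len(nw):  # prerequisite 2: same word count per line
            continue
        for o, n in zip(ow, nw):
            if o != n:
                candidates.setdefault(o, set()).add(n)
    # prerequisite 1: exactly one possible suggestion per word
    return {o: ns.pop() for o, ns in candidates.items() if len(ns) == 1}

print(suggest_replaces(["foo", "foo", "foo"], ["foo", "bar", "bar"]))
# {'foo': 'bar'}
print(suggest_replaces(["foo", "foo", "foo"], ["foo", "bar", "bar2"]))
# {} -- ambiguous, so nothing is suggested
```

Note how the second call returns nothing: "foo" maps to both "bar" and "bar2", so by prerequisite 1 no replace is suggested.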

For more details about the implementation you can look at the look-for-replaces wiki page.

Automatic detection of file renames for Darcs - Part 2

September 20, 2013 09:07 AM UTC

In the last few weeks I was refining the automatic file-rename detection implementation, adding support for Windows and for more complicated renames.

Now if you like you can consult the inode information saved in the index at any time with "darcs show index":
⮁ darcs init
⮁ mkdir testdir
⮁ touch testfile
⮁ darcs record -al -m "test files"
Finished recording patch 'test files'
⮁ ls -i1d . testdir testfile
2285722 .
2326707 testdir
2238437 testfile

⮁ darcs show index
07ec6ccf873cf215ac0789a420f154ba9218b7ca5c4fce432584edab49766a7c 2285722 ./
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 2326707 testdir/
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 2238437 testfile
Now, with the new dependency algorithm, you can make more complicated renames, such as exchanging filenames or moving folders. The algorithm doesn't handle exchanging filenames inside a folder that has itself been renamed as part of a name exchange; anything else is handled fine.
For example:
⮁ ls -1pC
_darcs/  dir/  dir2/  dir3/  foo  foo2  foo3  foo4  foo5
⮁ mv foo dir3
⮁ mv foo2 dir
⮁ mv foo3 dir2
⮁ mv foo4 foo4.tmp
⮁ mv foo5 foo4
⮁ mv foo4.tmp foo5
⮁ mv dir3 dir
⮁ mv dir dir2/dir2
⮁ mv dir2 dir
⮁ darcs whatsnew --look-for-moves
move ./dir ./dir2/dir2
move ./dir2 ./dir
move ./dir3 ./dir/dir2/dir3
move ./foo ./dir/dir2/dir3/foo3
move ./foo2 ./dir/dir2/foo2
move ./foo3 ./dir/foo3
move ./foo4 ./foo4.tmp~
move ./foo5 ./foo4
move ./foo4.tmp~ ./foo5
The moves shown by "darcs whatsnew --look-for-moves" are not exactly the ones made but yield the same final result.
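The temporary-name trick visible above (foo4.tmp~) can be sketched as a small move-ordering routine (a Python illustration of the idea, not darcs's actual code): emit moves whose destination is no longer a pending source, and break swap cycles with a temporary name.

```python
# Sketch of ordering a set of detected renames so no `mv` clobbers a
# pending source. Assumes destinations don't collide with unrelated
# existing files; darcs's real algorithm works on patch types, not dicts.

def plan_moves(moves):
    """moves: dict src -> dst. Returns an ordered list of (src, dst)."""
    moves = dict(moves)
    plan = []
    while moves:
        progress = False
        for src, dst in list(moves.items()):
            if dst not in moves:  # destination is not some pending source
                plan.append((src, dst))
                del moves[src]
                progress = True
        if not progress:  # a cycle (e.g. a filename swap): break it
            src, dst = next(iter(moves.items()))
            tmp = src + ".tmp~"
            plan.append((src, tmp))
            del moves[src]
            moves[tmp] = dst
    return plan

print(plan_moves({"foo4": "foo5", "foo5": "foo4"}))
# [('foo4', 'foo4.tmp~'), ('foo5', 'foo4'), ('foo4.tmp~', 'foo5')]
```

The swap example reproduces the foo4/foo5 exchange shown in the transcript above: the cycle is broken by parking foo4 under a temporary name first.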

August 14, 2013

Jose Luis Neder

Automatic detection of file renames for Darcs

August 14, 2013 04:29 AM UTC

In the last few weeks I was implementing automatic detection of file renames, adding a "look-for-moves" flag to the amend-record, record, and whatsnew commands.

In darcs there are three states:

• The recorded state is the one marked by the last record.
• The working state is the actual state of the files in the repository, with all the latest changes.
• The pending state is the one that marks changes like file adds, moves, and replaces before they are recorded. It is a temporary state between recorded and working that lets darcs know which filenames to track, and about uncommon changes like replaces.

If a file rename is not marked in the pending state, darcs loses track of the file and can't know where it is, and then darcs whatsnew and darcs record will report the file as deleted.
To detect such a rename I chose to use the inode info from the filesystem to check for equality between different filenames in the recorded and working states of the repo. For those who don't know, an inode is an index number assigned by the file system to identify a specific piece of file data. The file name is linked to the data by this number, and it is used for directories as well. You can consult this number with "ls -i".
⮁ mkdir testdir
⮁ touch testfile
⮁ ln testfile testfile.hardlink
⮁ ln -s testfile testfile.symboliclink
⮁ ls -i1
10567718 testdir
10485776 testfile
10485776 testfile.hardlink
10485767 testfile.symboliclink
You can see that the hardlink shares the same number as the test file. This is because a file is essentially a hardlink to the file data, and when you make a new hardlink you are sharing the same file data, and so the same inode number.
To have an old inode-to-filename mapping, there must be a record of the files' inodes somewhere, so I added the inode info to the hashed-storage index in _darcs/index. The index saves, roughly, the latest recorded state plus the pending state, so it is a perfect fit for this info.
Then, by comparing the RecordedAndPending tree (from the index) with the working tree, I get the file changes as a list of pairs mapping between the two states. With this list I resolve dependencies between the different moves, creating temporary names if necessary, and generate an FL of move patches to merge with the changes between the pending and working states.
These patches are shown by whatsnew, or are selected with record/amend-record to be recorded in the repo.
There is a little more to making this happen, but that's the core idea of the implementation.
The algorithm doesn't care whether the files are modified or not, because it doesn't look at the files' contents, so it's very robust in that sense.
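The core recorded-vs-working inode comparison can be sketched like this (a simplified Python illustration on a real filesystem; the actual implementation compares hashed-storage index trees and also tracks directories):

```python
# Minimal sketch of inode-based move detection: snapshot path->inode at
# "record" time, snapshot again later, and pair paths sharing an inode.
# Files only, for brevity; assumes a POSIX filesystem where rename
# preserves the inode number.
import os
import tempfile

def snapshot(root):
    """Map inode number -> relative path for every file under root."""
    snap = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            snap[os.stat(path).st_ino] = os.path.relpath(path, root)
    return snap

def detect_moves(recorded, working):
    """Pair old and new paths that share an inode but changed name."""
    return [(old, working[ino])
            for ino, old in recorded.items()
            if ino in working and working[ino] != old]

# demo: create a file, "record" it, rename it, detect the move
root = tempfile.mkdtemp()
open(os.path.join(root, "foo"), "w").close()
recorded = snapshot(root)                      # like the saved index
os.rename(os.path.join(root, "foo"), os.path.join(root, "foo2"))
print(detect_moves(recorded, snapshot(root)))  # [('foo', 'foo2')]
```

Because the comparison only looks at inode numbers, a renamed file is still recognised even if its contents were also edited, which is the robustness property mentioned above.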
With this implementation you can do any move directly with "mv". It is very lightweight and fast at detecting moves, so it would likely be a good decision to make "--look-for-moves" a default flag. You can do things like this:
⮁ darcs init
Repository initialized.
⮁ touch foo
⮁ darcs record -a -m add_file_foo -A x --look-for-adds
Finished recording patch 'add_file_foo'
⮁ mv foo foo2
⮁ darcs whatsnew --look-for-moves
move ./foo ./foo2
This doesn't work on Windows yet, because fileID (the function in unix-compat that gets the inode number) lacks a Windows implementation. The Windows API has GetFileInformationByHandle (which returns a BY_HANDLE_FILE_INFORMATION structure containing the file index number[1]), so it shouldn't be too hard to add an implementation with some boilerplate code for the interface.
More complicated moves should work, and some do, but I was having problems with the dependency-resolution algorithm implementation. I made some mistakes in the first implementation and have been dragging them along since then. I'm confident I know what the error is, so I will fix it soon.
UPDATE: I'm testing a Windows implementation with the Win32 Haskell library on a virtual machine.

August 09, 2013

Simon Michael

darcs hub, hledger, game dev

August 09, 2013 10:01 AM UTC

Hello blog. Since last time I’ve been doing plenty of stuff, but not telling you about it. Let’s do a bullet list and move on..

darcsden/darcs hub

hledger

• expanded the docs for conditional blocks (if statement) in CSV rules
• added an include directive to allow sharing of common CSV rules

FunGEn & game dev

A sudden burst of activity here.

• schmoozed in #haskell-game, got caught up on haskell game development, updated the Games wiki page a bit
• updated and published my template for SDL projects on OSX: hssdl-osx-template
• continued some FunGEn updates I’ve been sitting on for a year and did a release - see next post.

July 24, 2013

Simon Michael

darcs hub repo stats, hledger balance sheet

July 24, 2013 02:50 AM UTC

Recent activity:

I fixed another clumsy query on darcs hub, making the all repos page faster. Experimented with user and repo counts on the front page. I like it, but haven’t deployed to production yet. It costs about a quarter of a second in page load time (one 50ms couch query to fetch all repos, plus my probably-suboptimal filtering and sorting).

I’ve finally learned how many of those names on the front page have (public) repos behind them (144 out of 319), and how many private repos there are (125, higher than expected!).

Thinking about what is really most useful to have on the front page. Keep listing everything ? Just top 5 in various categories ? Ideas welcome.

Did a bunch of bookkeeping today, which inspired my first hledger commit in a while. I found the balancesheet command (abbreviation: bs) highly useful for a quick snapshot of assets and liabilities to various depths (add –depth N). The Equity section was just a distraction though, and I think it will be to most hledger users for the time being, so I removed it.

July 23, 2013

Simon Michael

hub hacking

July 23, 2013 12:30 AM UTC

More darcs hub activity, including some actual app development (yay):

Added news links to the front page.

Cleaned up hub’s docs repo and updated the list of blockers on the roadmap.

Updated/closed a number of issues, including the app-restarting #58, thanks to a fast highlighting-kate fix by John McFarlane.

Tested and configured the issue-closing commit posthook in the darcsden trunk repo. Commits pushed/merged there whose message contains the regex (closes #([0-9]+)|resolves #([0-9]+)|fixes #([0-9]+)) will now close the specified issue, with luck.
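For illustration, extracting issue numbers with that regex could look like this (a hypothetical Python helper; darcsden's actual posthook is not shown here):

```python
# Sketch of matching the issue-closing regex quoted above against a
# commit message and collecting the issue numbers to close.
import re

CLOSES = re.compile(r"(closes #([0-9]+)|resolves #([0-9]+)|fixes #([0-9]+))")

def issues_to_close(message):
    """Return issue numbers mentioned with closes/resolves/fixes."""
    return [int(g) for m in CLOSES.finditer(message)
            for g in m.groups()[1:] if g is not None]

print(issues_to_close("tidy CSS, fixes #58 and closes #21"))  # [58, 21]
```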

Consolidated a number of modules to help with code navigation, to be pushed soon.

Improved the redirect destination when deleting or forking repos or creating/commenting/closing issues.

Fixed a silly whitespace issue when viewing a patch, where the author name and date run together. I’m still confused about the specific code that generates this - the code I expect uses tables but firebug shows divs. A mystery for another day..

July 22, 2013

Simon Michael

hub speedups

July 22, 2013 12:30 AM UTC

More darcs hub hacking today.

• Cleaned up some button text & styles

• Started playing around with the front page layout to add a news box. Accidentally deployed it to production briefly. Used make undeploy-web for the first time.

• Extracted more of darcs hub’s front page into a separate (local) module, DarcsDen.Hub, for easier customisation.

• Worked on optimising page load speed, especially the front page:

• instead of rendering the front page content from markdown on each request, do it only when changed. This saved, oh a good 50ms! But I now have the beginnings of a timing utility.

• load the large codemirror js/css files only on the add/edit file pages

• set a 7-day Expires header on static images. This took about an hour. Here’s the code (based on Yesod’s), it might fit well in Snap.

• combine the remaining js/css into one js and one css. This is not committed, I just did it by hand. Try less later (only if it’s really simple).

YSlow now gives us an A grade (score 95), and it feels pretty quick.

July 21, 2013

Simon Michael

darcsden 1.1, darcs hub news

July 21, 2013 03:00 PM UTC

I’ve been hacking (mostly on darcsden/hub) but not blogging recently. Must get back to the old 45-15 minute routine.

• banged on hakyll for a while and set up tag feeds on this blog; joined Planet Darcs

• did code review, testing, deployment of BSRK Aditya’s GSOC enhancements

• spent about 8 hours clarifying darcsden history and writing release notes/announcement. Seriously ? Apparently yes.

• moved the darcs hub FAQ back to the front page, cleaned it up and added some javascript magic.

• fixed the slow user list on the front page - it was doing a query for each user. My effective page load time went from ~2 to 1s. Reducing the number of scripts will be a good next step.

• released darcsden 1.1 and the first darcs hub news update. Yay!

This packages up what we have been using at hub.darcs.net so that you can run it locally. It’s the first darcsden release installable from hackage, and the first with the UI updates from darcs hub. For now, it still requires CouchDB and Redis to run.

More importantly, this is about communicating the changes and current status of darcs hub, and doing a bit of marketing. Darcs hub hacking is fun, come and help! I include the announcement below.

darcsden 1.1 released

darcsden 1.1 is now available on hackage! This is the updated version of darcsden which runs hub.darcs.net, so these changes are also relevant to that site’s users. (More darcs hub news below.)

darcsden is a web application for browsing and managing darcs repositories, issues, and users, plus a basic SSH server which lets users push changes without a system login. It is released under the BSD license. You can use it:

• to browse and manage your local darcs repos with a more comfortable UI
• to make your repos browsable online, optionally with issue tracking
• to run a multi-user darcs hosting site, like hub.darcs.net

http://hackage.haskell.org/package/darcsden - cabal package
http://hub.darcs.net/simon/darcsden - source
http://hub.darcs.net/simon/darcsden/issues - bug tracker

Release notes for 1.1

Fixed:

• 16: Layout of links and navigation places them offscreen
• 21: anchors on line numbers exist but line numbers not clickable
• 28: forking then deleting a private repo makes repos unviewable
• 29: darcs get to an invalid ssh repo url hangs
• 46: if user kills a push, the lock file is not removed, preventing subsequent pushes

New:

• the signup page security question is case-insensitive (“darcs”)
• login redirects to the “my repos” page
• a more responsive layout, with content first, buttons at top/right
• many other UI updates; font, headings, borders, whitespace, robustness
• more context sensitivity in buttons & links
• better next/previous page controls
• better support for microsoft windows, runs as a service
• builds with GHC 7.6 and latest libraries
• easier developer builds

Brand new, from the Enhancing Darcsden GSOC (some WIP):

• you can sign up, log in, and link existing accounts with your Google or Github id
• you can reset your password
• you can edit files through the web
• you can “pack” your repositories, allowing faster darcs get

Detailed change log: http://hub.darcs.net/simon/darcsden/CHANGES.md

How to help

darcsden is a small, clean codebase that is fun to hack on. Discussion takes place on the #darcs IRC channel, and useful changes will quickly be deployed at hub.darcs.net, providing a tight dogfooding/feedback loop. Here’s how to contribute a patch there:

1. register at hub.darcs.net
2. add your ssh key in settings so you can push
3. fork your own branch: http://hub.darcs.net/simon/darcsden , fork
4. copy to your machine: darcs get http://hub.darcs.net/yourname/darcsden
5. make changes, darcs record
6. push to hub: darcs push yourname@hub.darcs.net:darcsden --set-default
7. your change will appear at http://hub.darcs.net/simon/darcsden/patches
8. discuss on #darcs, or ping me (sm, simon@joyful.com) to merge it

Credits

Alex Suraci created darcsden. Simon Michael led this release, which includes contributions from Alp Mestanogullari, Jeffrey Chu, Ganesh Sittampalam, and BSRK Aditya (sponsored by Google’s Summer of Code). And last time I forgot to mention two 1.0 contributors: Bertram Felgenhauer and Alex Suraci.

darcsden depends on Darcs, Snap, GHC, and other fine projects from the Haskell ecosystem, as well as Twitter Bootstrap, JQuery, and many more.

darcs hub news 2013/07

http://hub.darcs.net , aka darcs hub, is the darcs repository hosting site I operate. It’s like a mini github, but using darcs. You can:

• browse users, repos, files and changes
• publish darcs repos publicly or privately
• get, push and pull repos over ssh
• grant push access to other members
• fork repos, then view and merge upstream and downstream changes
• track issues

The site was announced on 2012/9/15 (http://thread.gmane.org/gmane.comp.version-control.darcs.user/26556). Since then:

• The site has been deploying new darcsden work promptly; it includes all the 1.1 release improvements described above.

• The server’s ram has doubled from 1G to 2G (thanks Linode). This means app restarts due to excessive memory use are less frequent.

• The front page’s user list had become slow and has been optimised, halving the page load time.

• BSRK Aditya is doing his Google Summer of Code project on enhancing darcsden and darcs hub (mentored by darcs developer Ganesh Sittampalam). Find out more at http://darcs.net/GSoC/2013-Darcsden .

• The site is being used, with many small projects and a few well-known larger ones. Quick stats as of 2013/07/19:

user accounts                       317
repos                               579
disk usage                            2.5G
uptime last 30 days                  99.48%
average response time last 30 days    1.6s
• The site remains free to use, including private repos. Eventually, some kind of funding will be needed to keep it self-sustaining, and could also enable faster development. Donate button ? Gittip ? Charge for private repos ? Let’s discuss.

Please try it out, report problems, and contribute patches to make it better.

July 20, 2013

Jose Luis Neder

Patience diff algorithm benefits for darcs

July 20, 2013 10:37 PM UTC

In this post I am going to explain the benefits of Bram Cohen's patience diff algorithm for darcs, but first we have to understand how the algorithm works. There are great posts on the web that explain it really well, so instead of trying to explain it again I'm going to quote the important parts, and then illustrate by example the benefits it has for darcs and Haskell-like non-curly languages.

A brief summary of what the patience diff algorithm does from Bram Cohen's Blog:

1. Match the first lines of both if they're identical, then match the second, third, etc. until a pair doesn't match.
2. Match the last lines of both if they're identical, then match the next to last, second to last, etc. until a pair doesn't match.
3. Find all lines which occur exactly once on both sides, then do longest common subsequence on those lines, matching them up.
4. Do steps 1-2 on each section between matched lines
From Alfedenzo's Blog:
The common diff algorithm is based on the longest common subsequence problem. Given (in this case) two documents, finding all lines that occur in both, in the same order. That is, making a third document such that every line in the document appears in both of the original documents, and in the same order. Once you have the longest common subsequence, all that remains is to describe the differences between each document and the common document, a much easier problem since the common document is a subset of the other documents.
While the diffs generated by this method are efficient, they tend not to be as human readable.
Patience Diff also relies on the longest common subsequence problem, but takes a different approach. First, it only considers lines that are (a) common to both files, and (b) appear only once in each file. This means that most lines containing a single brace or a new line are ignored, but distinctive lines like a function declaration are retained. Computing the longest common subsequence of the unique elements of both documents leads to a skeleton of common points that almost definitely correspond to each other. The algorithm then sweeps up all contiguous blocks of common lines found in this way, and recurses on those parts that were left out, in the hopes that in this smaller context, some of the lines that were ignored earlier for being non-unique are found to be unique. Once this process is finished, we are left with a common subsequence that more closely corresponds to what humans would identify.
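The four steps can be sketched as a compact implementation (a Python illustration of the algorithm as quoted, not darcs's Haskell code): trim the common prefix and suffix, compute a longest increasing subsequence over the lines unique to both sides via patience sorting, then recurse between the matched anchors.

```python
# Sketch of patience diff's matching phase: returns the pairs of line
# indices matched between two line lists; unmatched lines would become
# the -/+ sides of hunks. Segments with no unique common lines are left
# unmatched here (a real diff would fall back to e.g. Myers there).
from bisect import bisect_left

def unique_common(a, b):
    """(i, j) index pairs of lines occurring exactly once in each side."""
    count_a, count_b, pos_b = {}, {}, {}
    for x in a:
        count_a[x] = count_a.get(x, 0) + 1
    for j, x in enumerate(b):
        count_b[x] = count_b.get(x, 0) + 1
        pos_b[x] = j
    return [(i, pos_b[x]) for i, x in enumerate(a)
            if count_a[x] == 1 and count_b.get(x) == 1]

def patience_lis(pairs):
    """Longest subsequence of (i, j) pairs with increasing j,
    found by patience sorting with back-pointers."""
    piles, back = [], {}
    for pair in pairs:
        k = bisect_left([p[1] for p in piles], pair[1])
        back[pair] = piles[k - 1] if k > 0 else None
        if k == len(piles):
            piles.append(pair)
        else:
            piles[k] = pair
    seq, node = [], piles[-1] if piles else None
    while node is not None:
        seq.append(node)
        node = back[node]
    return seq[::-1]

def match(a, b):
    """Matched (index in a, index in b) pairs between two line lists."""
    if not a or not b:
        return []
    # steps 1-2: strip the common prefix and suffix
    pre = 0
    while pre < min(len(a), len(b)) and a[pre] == b[pre]:
        pre += 1
    suf = 0
    while (suf < min(len(a), len(b)) - pre
           and a[len(a) - 1 - suf] == b[len(b) - 1 - suf]):
        suf += 1
    if pre or suf:
        mid = match(a[pre:len(a) - suf], b[pre:len(b) - suf])
        return ([(i, i) for i in range(pre)]
                + [(i + pre, j + pre) for i, j in mid]
                + [(len(a) - suf + k, len(b) - suf + k) for k in range(suf)])
    # step 3: LCS over the lines unique to both sides
    anchors = patience_lis(unique_common(a, b))
    if not anchors:
        return []  # no unique anchors left in this segment
    # step 4: recurse on the sections between matched anchors
    matches, pa, pb = [], 0, 0
    for ai, bj in anchors:
        matches += [(x + pa, y + pb) for x, y in match(a[pa:ai], b[pb:bj])]
        matches.append((ai, bj))
        pa, pb = ai + 1, bj + 1
    matches += [(x + pa, y + pb) for x, y in match(a[pa:], b[pb:])]
    return matches
```

For example, `match(["x", "def f():", "y"], ["z", "def f():", "w"])` matches only the unique function-declaration line, exactly the "skeleton of common points" described in the quote.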

Then, when you modify something that sits between unique lines like this:
you get two different patches depending on which algorithm is used:
Patience diff:
Myers diff:
In this case, you have one hunk instead of three in the case of unrelated functions, but in the case of doSomething, you still have a separate hunk because of the unique line in common.
Normally Myers diff should perform badly when some lines are merely moved from one place to another, as in this case, and I'm glad to say that with darcs's fine-tuned Myers implementation this doesn't happen. But it still happens in curly-brace languages, as in this case (from here):
You get these two different diffs:
Patience diff:
Myers Diff:
In theory this could also happen without curly braces, if there are non-unique equal lines in a file, like this:
Patience Diff:
Myers Diff:
Here you can see that the hunks produced by the patience diff algorithm are more useful and understandable. But this example depends on equal lines that are rarely found in real cases, especially in Haskell, where the whitespace isn't necessarily the same between functions as it is in languages like Python.
Usually I would say it is better to have smaller hunks isolated between functions, because that should avoid dependencies between patches; but then there are more changes to select or unselect, and sometimes it depends on what you think is best for avoiding conflicts between patches. That is why you get the choice of one algorithm or the other.
You should also take into consideration how the algorithm is used.
When you use a command with a diff-algorithm flag, the algorithm is always used to calculate the hunks of the current unrecorded changes. The commands that do this are record, apply, mark-conflicts, pull, unpull, obliterate, revert, unrevert and rebase (suspend, unsuspend, reify, obliterate, inject and pull). The flag doesn't change an already saved patch, such as one that was sent or pushed; therefore patches to be applied or pulled are not modified by the diff flag.
When you use the record command, the saved patch depends on the diff algorithm and the hunks manually chosen. The patch is saved as hunks, so when you resolve conflicts between patches these saved hunks are used.
In the case of unrevert, take into account that the patch saved by revert is not affected by unrevert's diff flag. You can only get a different patch if you use the flag when you make the revert, i.e. "darcs revert --patience".